[ENH] convert sample sizes #919

Merged: jdkent merged 3 commits into neurostuff:main on Mar 11, 2025

Conversation

@jdkent (Member) commented Mar 11, 2025

references: https://neurostars.org/t/using-neurosynth-compose-and-annotations/32191/2

Changes proposed in this pull request:

  • add sample_sizes as an analysis metadata variable that can be used to transfer over to a NiMARE dataset

Summary by Sourcery

Adds support for 'sample_sizes' as an analysis metadata variable that can be used to transfer over to a NiMARE dataset.
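In practice, the intended workflow looks roughly like the sketch below (illustrative only; `nimads_dict` is a placeholder for a NiMADS studyset dictionary such as the `example_nimads_studyset` fixture used in the tests later in this thread):

```python
from nimare import io
from nimare.nimads import Studyset

# Illustrative sketch: nimads_dict stands in for a NiMADS studyset dict.
studyset = Studyset(nimads_dict)

# Attach per-analysis sample sizes as metadata; a list or tuple is expected.
for study in studyset.studies:
    for analysis in study.analyses:
        analysis.metadata["sample_sizes"] = [25, 30]

# The converter carries sample_sizes over into the NiMARE Dataset metadata.
dset = io.convert_nimads_to_dataset(studyset)
assert "sample_sizes" in dset.metadata.columns
```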

Enhancements:

  • Adds validation for sample sizes to ensure they are in the correct format (list, tuple, int, or float).
  • Improves error handling and type checking for annotations to ensure they are dictionaries.

sourcery-ai bot (Contributor) commented Mar 11, 2025

Reviewer's Guide by Sourcery

This pull request introduces the ability to use sample_sizes as an analysis metadata variable when converting NiMADS data to a NiMARE dataset. It also includes validation and error handling improvements for sample sizes and annotations. The implementation involves modifying the _analysis_to_dict function in nimare/io.py to handle sample_sizes and adding a new test case in nimare/tests/test_io.py to verify the functionality.
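As a rough sketch (not the literal diff; variable names follow the guide above and the review excerpts below), the sample-size handling inside `_analysis_to_dict` amounts to:

```python
# Sketch of the sample-size handling added to _analysis_to_dict
# (illustrative only; the actual code in nimare/io.py may differ in detail).
sample_sizes = analysis.metadata.get("sample_sizes")
if sample_sizes is not None and not isinstance(sample_sizes, (list, tuple)):
    raise TypeError(f"Expected sample_sizes to be a list or tuple, got {type(sample_sizes)}")

if not sample_sizes:
    # Fall back to a single sample_size on the analysis, then on the study.
    sample_size = analysis.metadata.get("sample_size") or study.metadata.get("sample_size")
    if sample_size is not None:
        if not isinstance(sample_size, (int, float)):
            raise TypeError(f"Expected sample_size to be numeric, got {type(sample_size)}")
        sample_sizes = [sample_size]

if sample_sizes:
    result["metadata"]["sample_sizes"] = sample_sizes
```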

Sequence diagram for converting analysis to dictionary with sample sizes

sequenceDiagram
  participant Study
  participant Analysis
  participant Result

  Analysis->>Analysis: Get sample_sizes from metadata
  alt sample_sizes is None or empty
    Analysis->>Analysis: Get sample_size from analysis metadata
    alt sample_size is None
      Analysis->>Study: Get sample_size from study metadata
    end
  end
  Analysis->>Result: Add sample_sizes to result metadata
  Result-->>Analysis: Return result

Updated class diagram for Analysis metadata handling

classDiagram
  class Study {
    metadata: dict
  }
  class Analysis {
    metadata: dict
    annotations: dict
  }
  class Result {
    metadata: dict
    labels: dict
  }
  Analysis ..> Study : Gets sample_size from
  Analysis ..> Result : Adds sample_sizes to
  Analysis ..> Result : Adds annotations to
  note for Analysis "Handles sample_sizes and annotations from metadata"

File-Level Changes

Change Details Files
Added support for sample_sizes as an analysis metadata variable when converting NiMADS to a NiMARE dataset.
  • Added validation to ensure sample_sizes is a list or tuple.
  • Added logic to extract a single sample_size from analysis or study metadata if sample_sizes is not available.
  • Added validation to ensure a single sample_size is numeric.
  • Added sample_sizes to the result metadata.
  • Added a test case to verify the correct conversion of sample_sizes from NiMADS to a NiMARE dataset.
nimare/io.py
nimare/tests/test_io.py
Improved error handling for annotations.
  • Added a check to ensure that each annotation is a dictionary.
  • Improved the error message when an invalid annotation format is encountered.
nimare/io.py

sourcery-ai bot (Contributor) left a comment

Hey @jdkent - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider consolidating the sample size handling logic to reduce redundancy.
  • It might be good to add a test case that covers the scenario where sample_size is in study metadata.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

# Validate sample sizes if present
if sample_sizes is not None and not isinstance(sample_sizes, (list, tuple)):
    raise TypeError(

suggestion (bug_risk): Review broad exception handling for annotations.

The try/except block around updating the 'labels' dictionary catches TypeError and AttributeError broadly and re-raises them as ValueError. This might mask issues that are not related to annotation formatting. Consider narrowing the exception scope to only the operations you expect might fail.
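For illustration, one way to narrow the scope is to validate each annotation explicitly instead of relying on a broad try/except (a sketch only, reusing the variable names from the excerpts in this review):

```python
# Sketch: check each annotation's type up front so that unrelated TypeErrors
# raised elsewhere are not swallowed and re-raised as ValueError.
labels = {}
for annotation in analysis.annotations.values():
    if not isinstance(annotation, dict):
        raise ValueError(f"Invalid annotation format: expected dict, got {type(annotation)}")
    labels.update(annotation)
result["labels"] = labels
```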

dset = io.convert_nimads_to_dataset(studyset)

assert isinstance(dset, nimare.dataset.Dataset)
assert "sample_sizes" in dset.metadata.columns

suggestion (testing): Add tests for edge cases and error handling.

It would be beneficial to include tests that cover edge cases and error conditions within the _analysis_to_dict function, such as when sample_sizes is not a list/tuple, sample_size is not numeric, or when annotations have an invalid format. This ensures that the error handling logic is robust and functions as expected.

Suggested implementation:

def test_convert_nimads_to_dataset_sample_sizes(
    example_nimads_studyset, example_nimads_annotation
):
    """Conversion of nimads JSON to nimare dataset."""
    studyset = Studyset(example_nimads_studyset)
    for study in studyset.studies:
        for analysis in study.analyses:
            analysis.metadata["sample_sizes"] = [2, 20]

import pytest

from nimare import io  # adjust imports as needed
from nimare.nimads import Studyset  # the tests below construct Studyset objects
def test_analysis_to_dict_invalid_sample_sizes_type(example_nimads_studyset):
    """Test _analysis_to_dict raises ValueError when sample_sizes is not a list/tuple."""
    studyset = Studyset(example_nimads_studyset)
    # Set sample_sizes to an int rather than list/tuple
    for study in studyset.studies:
        for analysis in study.analyses:
            analysis.metadata["sample_sizes"] = 5
    with pytest.raises(ValueError):
        # Trigger conversion which internally calls _analysis_to_dict
        io.convert_nimads_to_dataset(studyset)


def test_analysis_to_dict_non_numeric_sample_sizes(example_nimads_studyset):
    """Test _analysis_to_dict raises ValueError when sample_sizes contains non-numeric values."""
    studyset = Studyset(example_nimads_studyset)
    # Set sample_sizes to a list with non-numeric value
    for study in studyset.studies:
        for analysis in study.analyses:
            analysis.metadata["sample_sizes"] = [2, "twenty"]
    with pytest.raises(ValueError):
        io.convert_nimads_to_dataset(studyset)


def test_analysis_to_dict_invalid_annotations_format(example_nimads_studyset):
    """Test _analysis_to_dict raises ValueError when annotations are in an invalid format."""
    studyset = Studyset(example_nimads_studyset)
    # Here we assume that the annotation is expected to be a dict
    # Set annotation to an invalid format (e.g., a string)
    for study in studyset.studies:
        for analysis in study.analyses:
            analysis.metadata["annotations"] = "invalid_format"
    with pytest.raises(ValueError):
        io.convert_nimads_to_dataset(studyset)

These tests assume that the underlying _analysis_to_dict function (or the conversion process in io.convert_nimads_to_dataset) performs robust type checking and raises a ValueError when encountering:

  1. sample_sizes that is not a list/tuple,
  2. sample_sizes with non-numeric values, and
  3. annotations with an unexpected format.

You may need to adjust the exception type or the input modifications to match your actual implementation of error handling. Also, ensure that the test file has access to the Studyset class and the necessary fixture definitions (example_nimads_studyset, example_nimads_annotation) or import them as needed.

studyset = Studyset(example_nimads_studyset)
for study in studyset.studies:
    for analysis in study.analyses:
        analysis.metadata["sample_sizes"] = [2, 20]

suggestion (testing): Add test for single sample size.

Consider adding a test case where the sample_size is a single value instead of a list to ensure both code paths are tested.

Suggested implementation:

def test_convert_nimads_to_dataset_single_sample_size(
    example_nimads_studyset, example_nimads_annotation
):
    """Test conversion of nimads JSON to nimare dataset with a single sample size value."""
    studyset = Studyset(example_nimads_studyset)
    for study in studyset.studies:
        for analysis in study.analyses:
            analysis.metadata["sample_sizes"] = 20

    dset = io.convert_nimads_to_dataset(studyset)

    assert isinstance(dset, nimare.dataset.Dataset)
    assert "sample_sizes" in dset.metadata.columns

def test_convert_sleuth_to_dataset_smoke():

Make sure your test runner picks up the new test function. You may need to adjust the placement of the new test according to your project's test suite conventions.

@@ -62,13 +62,45 @@
            "z": [p.z for p in analysis.points] or [None],
        },
    }
    sample_size = study.metadata.get("sample_size")

issue (complexity): Consider extracting the inline sample size and annotation validations into helper functions to improve readability and reduce nesting.

For example, you could do the following:

```python
def validate_sample_sizes(analysis_metadata, study_metadata):
    sample_sizes = analysis_metadata.get("sample_sizes")
    if sample_sizes is not None and not isinstance(sample_sizes, (list, tuple)):
        raise TypeError(
            f"Expected sample_sizes to be list or tuple, but got {type(sample_sizes)}"
        )
    if not sample_sizes:
        sample_size = analysis_metadata.get("sample_size") or study_metadata.get("sample_size")
        if sample_size is not None and not isinstance(sample_size, (int, float)):
            raise TypeError(
                f"Expected sample_size to be numeric, but got {type(sample_size)}"
            )
        return [sample_size] if sample_size is not None else None
    return sample_sizes

def validate_annotations(annotations):
    labels = {}
    for annotation in annotations.values():
        if not isinstance(annotation, dict):
            raise TypeError(
                f"Expected annotation to be dict, but got {type(annotation)}"
            )
        labels.update(annotation)
    return labels
```

Then simplify your main code as follows:

sample_sizes_validated = validate_sample_sizes(analysis.metadata, study.metadata)
if sample_sizes_validated:
    try:
        result["metadata"]["sample_sizes"] = sample_sizes_validated
    except TypeError as e:
        raise TypeError(f"Error converting sample size data to list: {str(e)}") from e

if analysis.annotations:
    try:
        result["labels"] = validate_annotations(analysis.annotations)
    except (TypeError, AttributeError) as e:
        raise ValueError(f"Invalid annotation format: {str(e)}") from e

These changes isolate the validation logic into focused units, reducing complexity while preserving functionality.

codecov bot commented Mar 11, 2025

Codecov Report

Attention: Patch coverage is 68.18182% with 7 lines in your changes missing coverage. Please review.

Project coverage is 88.15%. Comparing base (895c154) to head (42b749c).

Files with missing lines Patch % Lines
nimare/io.py 68.18% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #919      +/-   ##
==========================================
- Coverage   88.23%   88.15%   -0.08%     
==========================================
  Files          48       48              
  Lines        6383     6400      +17     
==========================================
+ Hits         5632     5642      +10     
- Misses        751      758       +7     

☔ View full report in Codecov by Sentry.

@jdkent merged commit b838dbb into neurostuff:main on Mar 11, 2025
8 of 20 checks passed