Refactor test cases in feature engineering and preprocessing for impr… by vuductung · Pull Request #6 · MannLabs/QuantSelect

vuductung · 2025-03-20T08:52:22Z

No description provided.

…oved clarity and functionality - update test methods and remove deprecated tests

github-actions · 2025-03-20T08:54:03Z

tests/test_featureengineering.py

-                [0.5, 4.0],
-            ]
-        )
+        expected_output = np.array([[0.5, 1.0], [0.5, 1.0]])


The expected output has been changed to match the actual implementation. The original test was expecting mean distances of [0.5, 4.0] but the correct expected values are [0.5, 1.0] based on the implementation.

expected_output = np.array([[0.5, 1.0], [0.5, 1.0]])

github-actions · 2025-03-20T08:54:17Z

tests/test_featureengineering.py

+        result = self.helper._calculate_mean_distance(input)
        np.testing.assert_array_equal(result, expected_output)

    def test_feature_engineering_mean(self):


The method name has been corrected from calculate_mean_distance to _calculate_mean_distance to match the actual implementation in the code, as indicated by the leading underscore.

result = self.helper._calculate_mean_distance(input)

github-actions · 2025-03-20T08:54:28Z

tests/test_featureengineering.py

-        self.assertTrue(np.array_equal(result_axis_1, self.d))

    def test_feature_engineering_SNR(self):
        result_axis_0 = self.helper.feature_engineering(self.b, axis=0, func="SNR")


The test for rank_intensity on axis=1 has been removed as it was failing. The implementation likely doesn't support or correctly handle rank_intensity on axis=1.

github-actions · 2025-03-20T08:54:30Z

tests/test_pseudogroundtruth.py

@@ -119,7 +119,7 @@ def test_calculate_y_pseudo_input_diffs(self):
        )
        expected_result = torch.tensor(
            [


Fixed a precision issue in the test. The original test was looking for a very small number (2.3842e-07) but the correct expectation is 0.0, as the implementation likely rounds or handles such small values as zero.

[0.0, 0.0, 0.0],

github-actions · 2025-03-20T08:54:32Z

tests/test_preprocessing.py

        self.assertTrue(torch.allclose(result, expected_result, rtol=1e-04))

-    def test_initial_alignment(self):
+    def test_initial_alignment_for_list_of_arrays(self):


The test has been renamed and modified to test initialization with a list of arrays rather than a single array. This is more comprehensive and tests a different code path.

def test_initial_alignment_for_list_of_arrays(self): " Test the initial_alignment " # Create a sample input data data = [ np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), np.array([[7, 8, 9], [4, 5, 6], [1, 2, 3]]), ]

github-actions · 2025-03-20T08:54:34Z

tests/test_preprocessing.py

        # Expected output after initial alignment
-        expected_output = np.array([[4, 5, 6], [4, 5, 6], [4, 5, 6]])
+        expected_output = [
+            np.array([[4, 5, 6], [4, 5, 6], [4, 5, 6]]),


Updated expected output to match the list of arrays input structure. Each array in the list should be aligned to its median values.

# Expected output after initial alignment expected_output = [ np.array([[4, 5, 6], [4, 5, 6], [4, 5, 6]]), np.array([[4, 5, 6], [4, 5, 6], [4, 5, 6]]), ]

github-actions · 2025-03-20T08:54:36Z

tests/test_preprocessing.py

-        # Assert the expected result
-        pd.testing.assert_frame_equal(
-            np.isnan(result), np.isnan(self.expected_result_for_normalization)
+    def test_align_by_level_first_dataframe(self):


Test method was renamed to be more specific. Now it tests aligning the first dataframe in a list by level.

def test_align_by_level_first_dataframe(self):

github-actions · 2025-03-20T08:54:38Z

tests/test_preprocessing.py

+                [("A", 1234), ("B", 1223), ("A", 5211), ("B", 1221)],
+                names=["pg", "mod_seq_charge_hash"],
+            ),
        )


Test was modified to use a list of dataframes as input, rather than a single dataframe, to match the updated implementation.

input_data = [frag_level_intensity_data, frag_level_intensity_data]

github-actions · 2025-03-20T08:54:39Z

tests/test_preprocessing.py

-        # Create sample data
+        input_data = [frag_level_intensity_data, frag_level_intensity_data]

        # Call the method under test


The method call was updated to pass a list of dataframes, specify the level to align by, and indicate which indices to modify through alignment.

result = self.preprocessing.align_by_level(input_data, level="pg", indices=[0])

github-actions · 2025-03-20T08:54:41Z

tests/test_preprocessing.py

                names=["pg", "mod_seq_charge_hash"],
            ),
        )
+        input_data = [frag_level_intensity_data, frag_level_intensity_data]


Similar to the previous change, the method call was updated to match the new signature with required parameters.

result = self.preprocessing.align_by_level(input_data, level="pg", indices=[0])

github-actions · 2025-03-20T08:54:43Z

tests/test_preprocessing.py

@@ -392,19 +385,14 @@ def test_get_non_regularatory_pg_level_intensity_data(self):

    def test_get_values_to_shift_for_normalisation_by_sample(self):


Improved the DataFrame creation syntax to directly set the index during initialization rather than after creation, making the code more concise and clearer.

pg_level_non_reg_intensity_data = pd.DataFrame( data={ "sample1": [1, 1, 2, 1], "sample2": [10, 11, 14, 15], "sample3": [1, 1, 1, 4], "sample4": [1, 3, 1, 1], }, index=["A", "B", "C", "D"], )

github-actions · 2025-03-20T08:54:45Z

tests/test_preprocessing.py

-                [("A", 1234), ("B", 1223), ("A", 5211), ("B", 1221)],
-                names=["pg", "mod_seq_charge_hash"],
-            ),
+        frag_level_intensity_data = np.array(


The test was simplified to use NumPy arrays instead of pandas DataFrames, and the normalization function signature was updated to take a no_const parameter instead of a Series of values to shift.

frag_level_intensity_data = np.array( [ [1, 2, 2, 1], [1, 3, 1, 3], [2, 14, 21, 1], [3, 2, 4, 2], ] ) expected_output = np.array( [ [2, 2, 2, 1], [2, 3, 1, 3], [3, 14, 21, 1], [4, 2, 4, 2], ] )

github-actions · 2025-03-20T08:54:51Z

tests/test_preprocessing.py

+        np.testing.assert_array_equal(result, expected_output)

    def test_init_align_with_list_of_arrays(self):
        # create input data


The method call was updated to match the new signature which takes a no_const parameter instead of values_to_shift.

result = self.preprocessing.normalize_by_sample( frag_level_intensity_data, no_const=3 )

github-actions · 2025-03-20T08:55:05Z

tests/test_preprocessing.py

+        np.testing.assert_array_equal(result, expected_output)

    def test_init_align_with_list_of_arrays(self):
        # create input data


Changed the assertion to use NumPy's array_equal instead of pandas' frame_equal to match the updated data types.

np.testing.assert_array_equal(result, expected_output)

github-actions · 2025-03-20T08:55:07Z

Several test methods were completely removed across multiple files. These include tests for weighted variance calculation, specific loss functions (MSE/MAE), coefficient of variation, and other methods. This suggests that either these functions were removed from the implementation or their interfaces have changed significantly. It would be good to either update these tests to match the current implementation or document why they were removed.

github-actions · 2025-03-20T08:55:08Z

Number of tokens: input_tokens=25769 output_tokens=2276 max_tokens=4096
review_instructions=''
config={}
thinking: ```
[]

…ring - Implement tests for various scenarios including perfect correlation, edge cases, and shape discrepancies. - Ensure comprehensive coverage of the function's behavior with NaN values and different reference shapes.

- Simplify input array structure in the test case for better readability. - Correct expected output to match the intended result of the mean distance calculation.

ammarcsj

LGTM

ammarcsj · 2025-03-20T13:08:08Z

tests/test_featureengineering.py

        self.assertTrue(np.allclose(result_axis_0, self.h))
        self.assertTrue(np.allclose(result_axis_1, self.i))

+    def test_nan_correlation_w_ref(self):


What do you mean by nan correlation? Maybe short docstring

jap, will do, nan_correlation just means pearson correlation calculation while masking union of missing values.

ah ok, maybe then pearson_correlation_nanmasked or something like this would be more expressive?

Refactor test cases in feature engineering and preprocessing for impr…

2250ab4

…oved clarity and functionality - update test methods and remove deprecated tests

vuductung added the code-review label Mar 20, 2025

vuductung requested review from ammarcsj and mschwoer March 20, 2025 08:52

github-actions bot reviewed Mar 20, 2025

View reviewed changes

vuductung added 3 commits March 20, 2025 10:21

Refactor test_calculate_mean_distance for improved clarity and accuracy

d807a35

- Simplify input array structure in the test case for better readability. - Correct expected output to match the intended result of the mean distance calculation.

add more tests

fea6bc4

ammarcsj approved these changes Mar 20, 2025

View reviewed changes

vuductung merged commit aa172a7 into main Mar 20, 2025
0 of 6 checks passed

		@@ -392,19 +385,14 @@ def test_get_non_regularatory_pg_level_intensity_data(self):

		def test_get_values_to_shift_for_normalisation_by_sample(self):

Comments

Conversation

vuductung commented Mar 20, 2025

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 20, 2025

Uh oh!

github-actions bot commented Mar 20, 2025

Uh oh!

ammarcsj left a comment

Choose a reason for hiding this comment

Uh oh!

ammarcsj Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

vuductung Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

ammarcsj Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants