Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add harmonic mean combination #238

Conversation

martin-gaievski
Copy link
Member

@martin-gaievski martin-gaievski commented Jul 31, 2023

Description

Adding harmonic mean technique (it's a reciprocal of the arithmetic mean of the reciprocals, more details here ). It does support weighted mean similarly to arithmetic mean. Example of pipeline with processor config:

{
    "description": "Post processor for hybrid search",
    "phase_results_processors": [
        {
            "normalization-processor": {}, 
            "combination": {
                 "technique": "harmonic_mean",
                  "parameters": {
                        "weights": [
                            0.4, 0.7
                        ]
                    }
             }
        }
    ]
}

Issues Resolved

#228, part of solution for #126

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@codecov
Copy link

codecov bot commented Jul 31, 2023

Codecov Report

Merging #238 (2527fee) into feature/normalization (6ad641a) will increase coverage by 3.59%.
The diff coverage is 93.93%.

❗ Current head 2527fee differs from pull request most recent head 8936d5d. Consider uploading reports for the commit 8936d5d to get more accurate results

@@                     Coverage Diff                     @@
##             feature/normalization     #238      +/-   ##
===========================================================
+ Coverage                    82.43%   86.02%   +3.59%     
- Complexity                     323      331       +8     
===========================================================
  Files                           26       27       +1     
  Lines                          979      959      -20     
  Branches                       153      150       -3     
===========================================================
+ Hits                           807      825      +18     
+ Misses                         108       69      -39     
- Partials                        64       65       +1     
Files Changed Coverage Δ
...processor/combination/ScoreCombinationFactory.java 100.00% <ø> (ø)
...ination/HarmonicMeanScoreCombinationTechnique.java 94.11% <88.88%> (+94.11%) ⬆️
...combination/AbstractScoreCombinationTechnique.java 95.45% <95.45%> (ø)
...ation/ArithmeticMeanScoreCombinationTechnique.java 89.47% <100.00%> (-2.84%) ⬇️

@@ -52,7 +38,7 @@ public float combine(final float[] scores) {
for (int indexOfSubQuery = 0; indexOfSubQuery < scores.length; indexOfSubQuery++) {
float score = scores[indexOfSubQuery];
if (score >= 0.0) {
float weight = getWeightForSubQuery(indexOfSubQuery);
float weight = getWeightForSubQuery(this.weights, indexOfSubQuery);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename local variable name to a different one to avoid future mistake?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
/**
* Base class for score normalization technique
*/
public abstract class AbstractScoreCombinationTechnique {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not seeing this class as an abstract class, for me its more like having some utility functions. All we are doing here is just doing some utility things. I would rather move this class as a utility class like ScoreCombinationUtility.

Also, the function: getSupportedParams, seems best for an interface and not for an abstract class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Utility class was my second option, my concern was that those methods will be visible. We can put utility class into same package and at least not make methods private. Ack for the method with supported params, it should go to the interface especially if we'll get rid of the abstract class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, exposing those utility methods is not a big concern.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored abstract class into util class as suggested

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make the utility class as package private. I think thats a good thing to do.

@martin-gaievski
Copy link
Member Author

CI will fail for some time until compilation errors are fixed in knn, related PR opensearch-project/k-NN#1014

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@martin-gaievski martin-gaievski force-pushed the add-harmonic-mean-weighted-geometric branch from a8ba420 to 3767707 Compare August 1, 2023 17:18
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@martin-gaievski martin-gaievski merged commit 1f67b94 into opensearch-project:feature/normalization Aug 1, 2023
8 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants