Add dependency on k-NN plugin #10

Merged 1 commit on Oct 6, 2022

Conversation

jmazanec15
Member

Description

Adds dependency on k-NN in the build.gradle file. The dependency resolution will work by first pulling the opensearch-knn zip from Maven and then unzipping it and adding it to the classpath.

Added a unit test to confirm that it works.
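
A minimal sketch of how these steps could be wired together in build.gradle, not necessarily the exact code in this PR. `zipArchive` and `knnJarDirectory` appear in the diff; the `extractKnnJar` task name, the directory path, and the assumption that the jars sit at the root of the zip are illustrative guesses. It also assumes `opensearch_build` is defined elsewhere in the build and that a plugin providing `api` and `compileJava` (for example `java-library`) is applied.

```groovy
// Assumed location for the extracted jars (hypothetical path)
def knnJarDirectory = "${buildDir}/dependencies/opensearch-knn"

configurations {
    // Custom configuration used only to resolve the plugin zip from Maven
    zipArchive
}

dependencies {
    // Step 1: pull the opensearch-knn zip from Maven
    zipArchive group: 'org.opensearch.plugin', name: 'opensearch-knn', version: "${opensearch_build}"
    // Step 3: put the extracted jars on the classpath
    api fileTree(dir: knnJarDirectory, include: '*.jar')
}

// Step 2: unzip the resolved archive and keep only the jars
// (task name is hypothetical; assumes jars are at the zip root)
task extractKnnJar(type: Copy) {
    from {
        configurations.zipArchive.collect { zipTree(it) }
    }
    include '*.jar'
    into knnJarDirectory
}

// The jars must exist before compilation reads the fileTree
compileJava.dependsOn extractKnnJar
```

The separate configuration keeps the zip itself off the compile classpath; only the extracted jars reach `api` through the `fileTree`.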

Issues Resolved

#9

Check List

  • New functionality includes testing.
    • All tests pass
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: John Mazanec <jmazane@amazon.com>
jmazanec15 requested a review from a team on October 4, 2022, 17:21
dependencies {
    api "org.opensearch:opensearch:${opensearch_version}"
    zipArchive group: 'org.opensearch.plugin', name: 'opensearch-knn', version: "${opensearch_build}"
    api fileTree(dir: knnJarDirectory, include: '*.jar')
Collaborator

Why are we taking it as a proper dependency?

Collaborator

Also, why can't we just pull it from Maven Central?

Member Author

We can get the zip from Maven, but we need to add the jar to the classpath. To do this, we must unzip the zip file.

Collaborator

Did we try using the api configuration? It will add the dependency to the classpath.

Member Author

Yes, I tried. api won't unzip the zip file. We have to do that ourselves.

Collaborator

How is this different from getting the ML Commons dependency, or any other dependency in the project such as Apache Commons? They all get downloaded from Maven.

Member Author

Maven artifacts do not necessarily have to be jars. Following opensearch-project/opensearch-build#1916, k-NN published the zip using the custom pluginzip function (see k-NN's build.gradle).

ml-commons uses a different method that publishes just the jar to Maven. k-NN could also do this, but does not right now. Hence, we have to rely on the zip.
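
One way to check what is actually being resolved, as a rough sketch: a hypothetical helper task (the name `printKnnArtifacts` is made up for illustration) that lists the files behind the `zipArchive` configuration. It assumes the configuration is declared as in this PR; given the explanation above, it would be expected to list a zip rather than a jar.

```groovy
// Hypothetical helper task, not part of this PR:
// prints the file names that the zipArchive configuration resolves to
task printKnnArtifacts {
    doLast {
        configurations.zipArchive.files.each { artifact ->
            println artifact.name
        }
    }
}
```

Running `./gradlew printKnnArtifacts` would then show which artifact type Maven delivered for the k-NN coordinates.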

Collaborator

ylwu-amzn commented Oct 5, 2022

@jmazanec15, it seems reasonable to publish just the shared part as a jar to Maven, as ml-commons does, so that client plugins like neural-search don't need to add the whole zip as a dependency. But that takes time. I think it's fine to rely on the zip for the current phase. I don't see other risks, but we should confirm that others have no major concerns, for example by asking the infra team.

Member Author

@ylwu-amzn I think this is good for now. I don't see any risks. The integ tests install ml-commons, k-NN and neural-search in the same cluster without any issues, so I think it will be okay. What other risks are involved?

Collaborator

Let's go ahead and create an issue on k-NN for this. Approving the PR in the meantime.

Comment on lines +135 to +136
zipArchive group: 'org.opensearch.plugin', name:'opensearch-knn', version: "${opensearch_build}"
api fileTree(dir: knnJarDirectory, include: '*.jar')
Collaborator

Suggested change
- zipArchive group: 'org.opensearch.plugin', name:'opensearch-knn', version: "${opensearch_build}"
- api fileTree(dir: knnJarDirectory, include: '*.jar')
+ api group: 'org.opensearch.plugin', name:'opensearch-knn', version: "${opensearch_build}"

Member Author

Refer to the comment above.

jmazanec15 merged commit 34f358d into opensearch-project:main on Oct 6, 2022
Member

dblock commented Oct 19, 2022

k-NN should publish its JAR to Maven if something is going to take a dependency on it; we have all the mechanisms to do so. Is there a GitHub issue for that? @jmazanec15

zane-neo added a commit to zane-neo/neural-search that referenced this pull request Oct 20, 2022
zane-neo added a commit that referenced this pull request Oct 20, 2022
* # This is a combination of 14 commits.
# This is the 1st commit message:

Add text embedding processor to neural search

Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #2 will be skipped:

# Code format
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #3 will be skipped:

# Address review comments
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #4 will be skipped:

# Add blocking text embedding method for pipeline processor
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #5 will be skipped:

# Add BaseNeuralSearchIT and address other review comments
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #6 will be skipped:

# Add BaseNeuralSearchIT and address other review comments
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #7 will be skipped:

# Add BaseNeuralSearchIT and address other review comments
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #8 will be skipped:

# Fix naming convention and IT function move to base
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #9 will be skipped:

# Fix naming convention and IT function move to base
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #10 will be skipped:

# Update src/main/java/org/opensearch/neuralsearch/ml/MLCommonsClientAccessor.java
#
# Co-authored-by: Navneet Verma <vermanavneet003@gmail.com>

# The commit message #11 will be skipped:

# Update src/main/java/org/opensearch/neuralsearch/processor/TextEmbeddingProcessor.java
#
# Co-authored-by: Navneet Verma <vermanavneet003@gmail.com>

# The commit message #12 will be skipped:

# Fix code review comments
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #13 will be skipped:

# Fix text embedding processor NPE
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

# The commit message #14 will be skipped:

# Remove jackson dependencies and fix tests with XContent
#
# Signed-off-by: Zan Niu <zaniu@amazon.com>

* Add text embedding processor to neural search

Signed-off-by: Zan Niu <zaniu@amazon.com>

* Remove unnecessary parameters in TextEmbeddingProcessor method

Signed-off-by: Zan Niu <zaniu@amazon.com>

* Remove unnecessary empty string checks

Signed-off-by: Zan Niu <zaniu@amazon.com>

* Add field max depth limit to prevent malicious attack

Signed-off-by: Zan Niu <zaniu@amazon.com>

Signed-off-by: Zan Niu <zaniu@amazon.com>