Skip to content

Improve custom analyzer config #706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 13, 2022

Conversation

adamretter
Copy link
Contributor

@adamretter adamretter commented Nov 17, 2021

Counterpart to eXist-db/exist#4082.

Copy link
Contributor

@duncdrum duncdrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs some explication. the difference between String and String[ ] is far from self-evident.

the prose mentions javadocs but lacks a link, its also ambiguous if refers to the javadoc of Lucene 4 which might not be easy to find anymore, or exist-db 5.4.0 which similarly are hard to find in a publicly available location.

also please resolve the conflict, needs a rebase

@adamretter adamretter force-pushed the improve-custom-analyzer-config branch from 851bb04 to eba9f14 Compare November 17, 2021 21:48
@line-o
Copy link
Member

line-o commented Nov 17, 2021

I would like to see an example for at least one of the new types added to the documentation.
As you just added this you will likely have one at hand.

@adamretter
Copy link
Contributor Author

adamretter commented Nov 18, 2021

@line-o Here you go:

<analyzer class="com.evolvedbinary.lucene.analyzer.PLMultiAnalyzer">
  <param name="minimumTermLength" type="int" value="2" />
  <param name="punctuationDictionary" type="java.lang.char[]">
    <value>'</value>
    <value>-</value>
    <value>’</value>
  </param>
<analyzer>

@line-o
Copy link
Member

line-o commented Nov 18, 2021

@adamretter org.apache.lucene.analysis.util.CharArraySet is already supported.

@line-o
Copy link
Member

line-o commented Nov 18, 2021

@adamretter My original request was also to add the example as a listing to the documentation for others to be able to see how to use that feature.

@adamretter
Copy link
Contributor Author

Sorry that wasn't clear at all @line-o . Also that analyser above is private so I don't think it makes sense as an example in the docs. We don't have a public analyser that takes such arguments.

@line-o
Copy link
Member

line-o commented Nov 18, 2021

To describe an interface you do not need to publish the implementation. Additional argument types are useless until it is documented how to declare them in the index configuration.

I see you edited your example to now pass in java.lang.char[] (edit: char is literal type)
. So that would be really helpful to have in there like a so:

<analyzer class="my.PunctuationAnalyzer">
  <param name="stringArray" type="char[]">
    <value>'</value>
    <value>-</value>
    <value>’</value>
  </param>
<analyzer>

Copy link
Member

@line-o line-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please correct the types in the keyword list.

@adamretter adamretter force-pushed the improve-custom-analyzer-config branch from 487fd72 to e631e1c Compare November 18, 2021 16:30
@joewiz
Copy link
Member

joewiz commented Dec 10, 2021

@duncdrum Are there still outstanding changes you're requesting? If not, could you mark this as approved?

@line-o line-o requested a review from duncdrum December 10, 2021 15:33
Copy link
Member

@dizzzz dizzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamretter
Copy link
Contributor Author

@joewiz @dizzzz Do you think we can get this merged now?

@joewiz
Copy link
Member

joewiz commented Jan 13, 2022

According to the commit messages @adamretter addressed @duncdrum's concerns, and @duncdrum has had over a month to raise objections, so I will override the block and merge.

@joewiz joewiz merged commit 2441265 into eXist-db:master Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants