Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][io] ElasticSearch sink: align null fields behaviour #18577

Merged
merged 3 commits into from
Nov 23, 2022

Conversation

nicoloboschi
Copy link
Contributor

@nicoloboschi nicoloboschi commented Nov 23, 2022

Motivation

In the ES Sink we use two different clients based on the target instance of elastic. With ES 7 the opensearch high level client is used and against ES 8 the newest Java client is used.

With stripNulls=false, when sending json with null fields the first one correctly send the json keeping the field equals to null. The Java client instead drop the null fields during the payload serialization.

This is useful for end users because Elastic let you to configure null_field for fields in order to be able to query the index checking for nulls.
In any case, the usage of the client should be transparent to the user and it must behaves in the same way.

Modifications

Note that the decision to keep or strip null fields is up to the sink itself with the option stripNulls. Therefore the client must be configured to always keep nulls.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codecov-commenter
Copy link

codecov-commenter commented Nov 23, 2022

Codecov Report

Merging #18577 (84e53e2) into master (cd85a67) will decrease coverage by 0.34%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #18577      +/-   ##
============================================
- Coverage     47.39%   47.04%   -0.35%     
+ Complexity    10479    10415      -64     
============================================
  Files           698      698              
  Lines         68070    68070              
  Branches       7279     7279              
============================================
- Hits          32264    32026     -238     
- Misses        32228    32474     +246     
+ Partials       3578     3570       -8     
Flag Coverage Δ
unittests 47.04% <ø> (-0.35%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...lsar/broker/loadbalance/impl/ThresholdShedder.java 3.26% <0.00%> (-39.87%) ⬇️
.../apache/pulsar/broker/loadbalance/LoadManager.java 61.11% <0.00%> (-16.67%) ⬇️
.../apache/pulsar/broker/admin/impl/PackagesBase.java 54.12% <0.00%> (-13.77%) ⬇️
...tent/NonPersistentDispatcherMultipleConsumers.java 40.74% <0.00%> (-12.35%) ⬇️
...apache/pulsar/broker/service/TopicListService.java 42.62% <0.00%> (-12.30%) ⬇️
.../pulsar/broker/service/SharedConsumerAssignor.java 68.51% <0.00%> (-9.26%) ⬇️
...org/apache/pulsar/broker/loadbalance/LoadData.java 58.33% <0.00%> (-8.34%) ⬇️
...pulsar/broker/service/PulsarCommandSenderImpl.java 72.30% <0.00%> (-6.16%) ⬇️
...he/pulsar/client/impl/PartitionedProducerImpl.java 30.34% <0.00%> (-5.13%) ⬇️
...roker/loadbalance/impl/ModularLoadManagerImpl.java 65.55% <0.00%> (-3.35%) ⬇️
... and 37 more

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One comment inline.

@@ -39,10 +39,10 @@
public abstract class ElasticSearchTestBase {

public static final String ELASTICSEARCH_8 = Optional.ofNullable(System.getenv("ELASTICSEARCH_IMAGE_V8"))
.orElse("docker.elastic.co/elasticsearch/elasticsearch:8.1.0");
.orElse("docker.elastic.co/elasticsearch/elasticsearch:8.5.1");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8.5.2 in pom.xml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the server. https://hub.docker.com/_/elasticsearch/tags
8.5.1 is the latest 8.x

@liangyepianzhou
Copy link
Contributor

@nicoloboschi Could you please help open a PR to cherry-pick this to branch-2.10 (To avoid including more bugs)?

@liangyepianzhou
Copy link
Contributor

I will move this to 2.10.4. If you have any questions, feel free to ping me.

@Technoboy- Technoboy- added this to the 3.0.0 milestone Feb 8, 2023
@dave2wave
Copy link
Member

@liangyepianzhou @Technoboy @nicoloboschi - Can you please tell me why this was NOT cherrypicked to 2.10? We are seeing errors in our ESSink integration tests on the 2.10 branch and it looks like this could be the cause.

@liangyepianzhou
Copy link
Contributor

@dave2wave There is not ElasticSearchJavaRestClient.java class in 2.10 branch, so the modification for ElasticSearchJavaRestClient.java class should not be cherry-picked to 2.10 branch, right?

@nicoloboschi nicoloboschi deleted the es-null-values branch June 23, 2023 13:17
@nicoloboschi
Copy link
Contributor Author

@dave2wave There is not ElasticSearchJavaRestClient.java class in 2.10 branch, so the modification for ElasticSearchJavaRestClient.java class should not be cherry-picked to 2.10 branch, right?

Exactly. Branch 2.10 doesn't contain the new java client introduced in #14805

@dave2wave
Copy link
Member

@nicoloboschi Should we cherry-pick the PR #14805 and associated? Or just say that ES8 is not supported in 2.10, but is in 2.9, 2.11, and 3.0?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants