-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GitHub-issue#253 : Implemented GeoIP processor functionality #2925
GitHub-issue#253 : Implemented GeoIP processor functionality #2925
Conversation
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
private final Counter geoIpProcessingMismatchCounter; | ||
private final GeoIPProcessorConfig geoIPProcessorConfig; | ||
private final String tempPath; | ||
private static final String TAGS_ON_MATCH_FAILURE = "tags_on_match_failure"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be "tags_on_geoip_source_not_found". We should not use same tag for all processor. You should also have this in the GeoIP config. The YAML config option can be "tags_on_source_not_found" but the actual tag should have "geoip" in it like I suggested above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added List tagsOnSourceNotFoundFailure in YAML configuration and necessary changes in respective classes and Readme.MD
geoData = new GetGeoIP2Data(finalPath.concat(flipDatabase), geoIPProcessorConfig.getServiceType().getMaxMindService().getCacheSize(), geoIPProcessorConfig); | ||
} | ||
} | ||
downloadReady = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should do this before it is used in the "synchronized" block, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downloadReady should be false until download successful. The flag will become true once download successful in downloadThroughURLandS3()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downloadReady
is not initialized, right? Is it not possible while (!downloadReady())
executed before it is initialized to false?
toggle = !toggle; | ||
} else { | ||
flipDatabase = DATABASE_2; | ||
toggle = !toggle; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since togge = !toggle
is done in both cases, if can be done outside of if-else block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the line "togge = !toggle" above if and else block.
return null; | ||
public Map<String, Object> getGeoData(InetAddress inetAddress, List<String> attributes, String tempDestDir) { | ||
Map<String, Object> geoData = new HashMap<>(); | ||
if (GeoIPProcessorService.downloadReady) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this right? You are using a public static variable directly here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the downloadReady should be static to access in without creating instance. Checking this condition because enrichment should not happen until mmdb download completes.
return null; | ||
public Map<String, Object> getGeoData(InetAddress inetAddress, List<String> attributes, String tempDestDir) { | ||
Map<String, Object> geoData = new HashMap<>(); | ||
if (GeoIPProcessorService.downloadReady) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing here. I don't think this is correct approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the downloadReady should be static to access in without creating instance. Checking this condition because enrichment should not happen until mmdb download completes.
@@ -31,7 +38,39 @@ public static boolean ipValidationcheck(String input) throws UnknownHostExceptio | |||
* @throws UnknownHostException UnknownHostException | |||
*/ | |||
public static boolean PrivateIPVersionCheck(String ipAddress) throws UnknownHostException { | |||
//TODO : Checking for the IP is valid or not if IP is not a private IPV4 and IPV6 | |||
InetAddress address = InetAddress.getByName(ipAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this? Can we not use ``InetAddress.isSiteLocalAddress()`?
See http://docs.oracle.com/javase/1.5.0/docs/api/java/net/InetAddress.html#isSiteLocalAddress().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code with InetAddress.isSiteLocalAddress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! A few comments below:
...oip-processor/src/main/java/org/opensearch/dataprepper/plugins/processor/GeoIPProcessor.java
Outdated
Show resolved
Hide resolved
...oip-processor/src/main/java/org/opensearch/dataprepper/plugins/processor/GeoIPProcessor.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/opensearch/dataprepper/plugins/processor/databaseenrich/GetGeoIP2Data.java
Outdated
Show resolved
Hide resolved
Collection<Record<Event>> records = geoIPProcessor.doExecute(setEventQueue()); | ||
for (final Record<Event> record : records) { | ||
Event event = record.getData(); | ||
assertThat(event.get("/peer/ip", String.class), equalTo("136.226.242.205")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to test that events have been enriched with Geo data here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already tested in GetGeoLite2DataTest file in two methods.
1.getGeoDataTest_without_attributes()
2.getGeoDataTest_with_attributes().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking for assertions like the following added here. This is to test that the prepared Geo data actually gets into the event:
assertThat(event.get("location/country_name", String.class), equalTo("United States"));
assertThat(event.get("location/timezone", String.class), equalTo("America/Chicago"));
... ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case is to read the IP from the input JSON file based on the key present in YAML configuration. For example the path for reading the IP is peer/ip. Please find the below sample json input:
"peer" : {
"ip" : "1.2.3.4"
"host" : "example.org"
}
"status" : "success"
There is no enrichment in this test testcase to add assertions for country_name and timezone.
Enrichment related are tested in GetGeoLite2DataTest file in two methods.
1.getGeoDataTest_without_attributes()
2.getGeoDataTest_with_attributes()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing other comments! Looks pretty good other than this one:
There is no enrichment in this test testcase.
The test case is testing GeoIPProcessor's doExecute
method, which enriches the events with GeoIP data. And we want to verify that the events are actually enriched. This is a standard test for processors, this one is an example.
Something like this should work:
assertThat(event.get("/peer/ip", String.class), equalTo("136.226.242.205")); | |
assertThat(event.get("/peer/ip", String.class), equalTo("136.226.242.205")); | |
assertThat(event.get("location/country_name", String.class), equalTo("United States")); | |
assertThat(event.get("location/timezone", String.class), equalTo("America/Chicago")); | |
assertThat(event.get("location/country_IsoCode", String.class), equalTo("US")); | |
assertThat(event.get("location/continent_name", String.class), equalTo("North America")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doExecute() is common method for all the functionality to read IP from input json, calling enrichment method and updated records with enriched data.
The Enrichment related are tested in GetGeoLite2DataTest file in two methods.
1.getGeoDataTest_without_attributes()
2.getGeoDataTest_with_attributes()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that doExecute()
indeed includes call to getGeoData()
, which have been fully tested and are being mocked in this doExecuteTest()
. But as you said, there's more than just getGeoData()
in doExecute()
, which is why we need to assert the update event here. Specifically, this line that adds geo data to the event:
eventRecord.getData().put(key.getKeyConfig().getTarget(), geoData);
Imagine this line is missing, doExecuteTest()
should catch it. But in its current form, it won't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving so we can move forward.
...processor/src/test/java/org/opensearch/dataprepper/plugins/processor/GeoIPProcessorTest.java
Show resolved
Hide resolved
...r-plugins/geoip-processor/src/test/resources/mmdb-file/geo-enterprise/GeoIP2-Enterprise.mmdb
Outdated
Show resolved
Hide resolved
...cessor/src/main/java/org/opensearch/dataprepper/plugins/processor/GeoIPProcessorService.java
Outdated
Show resolved
Hide resolved
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
1cfe596
to
51722ba
Compare
…nkataraopasyavula/data-prepper into geoip-processor-functionality # Conflicts: # data-prepper-plugins/geoip-processor/build.gradle
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com>
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Approving and merging to make progress. I would still want you to address the comment |
* GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Fixed the test-case-failed issue. Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> --------- Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com> Co-authored-by: Deepak Sahu <deepak.sahu562@gmail.com> Signed-off-by: George Chen <qchea@amazon.com>
…rch-project#2925) * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Fixed the test-case-failed issue. Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> * GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com> --------- Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com> Co-authored-by: Deepak Sahu <deepak.sahu562@gmail.com> Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Description
Implemented GeoIP processor functionality
Issues Resolved
GitHub-issue #253
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.