Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub-issue#253 : Implemented GeoIP processor functionality #2925

Conversation

venkataraopasyavula
Copy link
Contributor

Description

Implemented GeoIP processor functionality

Issues Resolved

GitHub-issue #253

Check List

  • New functionality includes testing.
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
private final Counter geoIpProcessingMismatchCounter;
private final GeoIPProcessorConfig geoIPProcessorConfig;
private final String tempPath;
private static final String TAGS_ON_MATCH_FAILURE = "tags_on_match_failure";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be "tags_on_geoip_source_not_found". We should not use same tag for all processor. You should also have this in the GeoIP config. The YAML config option can be "tags_on_source_not_found" but the actual tag should have "geoip" in it like I suggested above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added List tagsOnSourceNotFoundFailure in YAML configuration and necessary changes in respective classes and Readme.MD

geoData = new GetGeoIP2Data(finalPath.concat(flipDatabase), geoIPProcessorConfig.getServiceType().getMaxMindService().getCacheSize(), geoIPProcessorConfig);
}
}
downloadReady = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should do this before it is used in the "synchronized" block, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downloadReady should be false until download successful. The flag will become true once download successful in downloadThroughURLandS3()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downloadReady is not initialized, right? Is it not possible while (!downloadReady()) executed before it is initialized to false?

toggle = !toggle;
} else {
flipDatabase = DATABASE_2;
toggle = !toggle;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since togge = !toggle is done in both cases, if can be done outside of if-else block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the line "togge = !toggle" above if and else block.

return null;
public Map<String, Object> getGeoData(InetAddress inetAddress, List<String> attributes, String tempDestDir) {
Map<String, Object> geoData = new HashMap<>();
if (GeoIPProcessorService.downloadReady) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? You are using a public static variable directly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the downloadReady should be static to access in without creating instance. Checking this condition because enrichment should not happen until mmdb download completes.

return null;
public Map<String, Object> getGeoData(InetAddress inetAddress, List<String> attributes, String tempDestDir) {
Map<String, Object> geoData = new HashMap<>();
if (GeoIPProcessorService.downloadReady) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here. I don't think this is correct approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the downloadReady should be static to access in without creating instance. Checking this condition because enrichment should not happen until mmdb download completes.

@@ -31,7 +38,39 @@ public static boolean ipValidationcheck(String input) throws UnknownHostExceptio
* @throws UnknownHostException UnknownHostException
*/
public static boolean PrivateIPVersionCheck(String ipAddress) throws UnknownHostException {
//TODO : Checking for the IP is valid or not if IP is not a private IPV4 and IPV6
InetAddress address = InetAddress.getByName(ipAddress);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? Can we not use ``InetAddress.isSiteLocalAddress()`?

See http://docs.oracle.com/javase/1.5.0/docs/api/java/net/InetAddress.html#isSiteLocalAddress().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code with InetAddress.isSiteLocalAddress.

Copy link
Collaborator

@oeyh oeyh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! A few comments below:

data-prepper-plugins/geoip-processor/README.md Outdated Show resolved Hide resolved
data-prepper-plugins/geoip-processor/README.md Outdated Show resolved Hide resolved
Collection<Record<Event>> records = geoIPProcessor.doExecute(setEventQueue());
for (final Record<Event> record : records) {
Event event = record.getData();
assertThat(event.get("/peer/ip", String.class), equalTo("136.226.242.205"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to test that events have been enriched with Geo data here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already tested in GetGeoLite2DataTest file in two methods.
1.getGeoDataTest_without_attributes()
2.getGeoDataTest_with_attributes().

Copy link
Collaborator

@oeyh oeyh Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking for assertions like the following added here. This is to test that the prepared Geo data actually gets into the event:

assertThat(event.get("location/country_name", String.class), equalTo("United States"));
assertThat(event.get("location/timezone", String.class), equalTo("America/Chicago"));
... ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case is to read the IP from the input JSON file based on the key present in YAML configuration. For example the path for reading the IP is peer/ip. Please find the below sample json input:

"peer" : {
"ip" : "1.2.3.4"
"host" : "example.org"
}
"status" : "success"

There is no enrichment in this test testcase to add assertions for country_name and timezone.
Enrichment related are tested in GetGeoLite2DataTest file in two methods.
1.getGeoDataTest_without_attributes()
2.getGeoDataTest_with_attributes()

Copy link
Collaborator

@oeyh oeyh Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing other comments! Looks pretty good other than this one:

There is no enrichment in this test testcase.

The test case is testing GeoIPProcessor's doExecute method, which enriches the events with GeoIP data. And we want to verify that the events are actually enriched. This is a standard test for processors, this one is an example.

Something like this should work:

Suggested change
assertThat(event.get("/peer/ip", String.class), equalTo("136.226.242.205"));
assertThat(event.get("/peer/ip", String.class), equalTo("136.226.242.205"));
assertThat(event.get("location/country_name", String.class), equalTo("United States"));
assertThat(event.get("location/timezone", String.class), equalTo("America/Chicago"));
assertThat(event.get("location/country_IsoCode", String.class), equalTo("US"));
assertThat(event.get("location/continent_name", String.class), equalTo("North America"));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doExecute() is common method for all the functionality to read IP from input json, calling enrichment method and updated records with enriched data.
The Enrichment related are tested in GetGeoLite2DataTest file in two methods.
1.getGeoDataTest_without_attributes()
2.getGeoDataTest_with_attributes()

Copy link
Collaborator

@oeyh oeyh Jul 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that doExecute() indeed includes call to getGeoData(), which have been fully tested and are being mocked in this doExecuteTest(). But as you said, there's more than just getGeoData() in doExecute(), which is why we need to assert the update event here. Specifically, this line that adds geo data to the event:

eventRecord.getData().put(key.getKeyConfig().getTarget(), geoData);

Imagine this line is missing, doExecuteTest() should catch it. But in its current form, it won't.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving so we can move forward.

venkataraopasyavula and others added 3 commits June 26, 2023 21:44
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
@venkataraopasyavula venkataraopasyavula force-pushed the geoip-processor-functionality branch 2 times, most recently from 1cfe596 to 51722ba Compare June 27, 2023 14:43
…nkataraopasyavula/data-prepper into geoip-processor-functionality

# Conflicts:
#	data-prepper-plugins/geoip-processor/build.gradle
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
venkataraopasyavula and others added 7 commits June 27, 2023 22:06
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com>
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
oeyh
oeyh previously approved these changes Jul 10, 2023
…ed Code review comments

Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>
@kkondaka
Copy link
Collaborator

Approving and merging to make progress. I would still want you to address the comment downloadReady flag and also global usage of this flag.

@kkondaka kkondaka merged commit 38c6843 into opensearch-project:main Jul 12, 2023
24 checks passed
chenqi0805 pushed a commit that referenced this pull request Jul 19, 2023
* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Fixed the test-case-failed issue.

Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

---------

Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com>
Co-authored-by: Deepak Sahu <deepak.sahu562@gmail.com>
Signed-off-by: George Chen <qchea@amazon.com>
MaGonzalMayedo pushed a commit to MaGonzalMayedo/data-prepper that referenced this pull request Jul 25, 2023
…rch-project#2925)

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Fixed the test-case-failed issue.

Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

* GitHub-issue#253 : Implemented GeoIP processor functionality. Addressed Code review comments
Signed-off-by: venkataraopasyavula <venkataraopasyavula@gmail.com>

---------

Signed-off-by: Deepak Sahu <deepak.sahu562@gmail.com>
Co-authored-by: Deepak Sahu <deepak.sahu562@gmail.com>
Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants