Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit for cidr function #706

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

salyh
Copy link
Contributor

@salyh salyh commented Sep 26, 2024

Description

Implement cidr function to check wether an ip address is with a given cidr

Issues Resolved

#671

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@salyh
Copy link
Contributor Author

salyh commented Sep 26, 2024

@YANG-DB @vamshin @anirudha

I pushed the scaffold for the CIDR function.
The real implementation which validates the ip address and check if its within the cidr is missing because we need to discuss the introduction of a new 3rd party library:

I suggest to add a dependency to https://github.com/seancfoley/IPAddress because this library seems to provide ip validation and in range checking for ipv4 and ipv6 addresses/cidrs. The library is Apache licensed.

Other options are the following libraries or code snippets (or of course manual implementation with java standard library which is error prone in my view):

Let me know what you think and what the general process is to discuss new dependencies

@YANG-DB
Copy link
Member

YANG-DB commented Sep 26, 2024

Hi @salyh
I agree - let's use https://github.com/seancfoley/IPAddress library since it's in use in our SQL repo already

Comment on lines 14 to 19
private UnresolvedExpression ipAddress;
private UnresolvedExpression cidrBlock;

public Cidr(UnresolvedExpression ipAddress, UnresolvedExpression cidrBlock) {
this.ipAddress = ipAddress;
this.cidrBlock = cidrBlock;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please merge the latest code base and refactor this class with Lombok annotations. Ref 38ca314

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@LantaoJin
Copy link
Member

LantaoJin commented Sep 27, 2024

Hi @salyh
Can you check if the Guava library can meet our needs?

@salyh
Copy link
Contributor Author

salyh commented Sep 27, 2024

Hi @salyh Can you check if the Guava library can meet our needs?

I checked but it does not support cidr checks. https://github.com/seancfoley/IPAddress is also already used in the sql project.

@salyh
Copy link
Contributor Author

salyh commented Sep 27, 2024

Hi @salyh I agree - let's use https://github.com/seancfoley/IPAddress library since it's in use in our SQL repo already

done

@YANG-DB
Copy link
Member

YANG-DB commented Sep 30, 2024

@salyh plz resolve conflict
thanks

@salyh
Copy link
Contributor Author

salyh commented Sep 30, 2024

@salyh plz resolve conflict thanks

done

@salyh
Copy link
Contributor Author

salyh commented Sep 30, 2024

@YANG-DB @vamshin @anirudha

Regarding validation of the CIDR function parameters:

The IPAddress library is very lenient regarding validation of ip addresss and cidr blocks.

I suggest we at least disallow partial ip address segements and empty or null adresses:

  IPAddressStringParameters valOptions = new IPAddressStringParameters.Builder()
                .allowEmpty(false)
                .setEmptyAsLoopback(false)
                .allow_inet_aton(false)
                .allowSingleSegment(false)
                .toParams();

Then the remaining question is wether we should enforce other syntax checks like which the library is not enforcing:

  • check that cidr block is a really a in cicdr notation (containing a '/')
  • check that ip address and cidr are either both ipv6 or both ipv4
  • define the behaviour for edge cases like 0.0.0.0 (or just rely on the api)
  • define handling of special ip adresses like multicast, anycast, 255...* etc

My suggestion is to only

  • disallow partial ip address segements and empty or null adresses (like already implemented) and leave everything else
  • disallow mixed ipv4/ipv6 adresses and cidr blocks

and leave everything else to the library.

@YANG-DB
Copy link
Member

YANG-DB commented Oct 7, 2024

@salyh
please add the needed documentation in the following locations:

@YANG-DB
Copy link
Member

YANG-DB commented Oct 7, 2024

@YANG-DB @vamshin @anirudha

Regarding validation of the CIDR function parameters:

The IPAddress library is very lenient regarding validation of ip addresss and cidr blocks.

I suggest we at least disallow partial ip address segements and empty or null adresses:

  IPAddressStringParameters valOptions = new IPAddressStringParameters.Builder()
                .allowEmpty(false)
                .setEmptyAsLoopback(false)
                .allow_inet_aton(false)
                .allowSingleSegment(false)
                .toParams();

Then the remaining question is wether we should enforce other syntax checks like which the library is not enforcing:

  • check that cidr block is a really a in cicdr notation (containing a '/')
  • check that ip address and cidr are either both ipv6 or both ipv4
  • define the behaviour for edge cases like 0.0.0.0 (or just rely on the api)
  • define handling of special ip adresses like multicast, anycast, 255...* etc

My suggestion is to only

  • disallow partial ip address segements and empty or null adresses (like already implemented) and leave everything else
  • disallow mixed ipv4/ipv6 adresses and cidr blocks

and leave everything else to the library.

@salyh
Thanks for this review - I support your suggestion

  • disallow partial ip address segements and empty or null adresses (like already implemented) and leave everything else
  • disallow mixed ipv4/ipv6 adresses and cidr blocks

In addition plz document the additional validations as future work:

check that cidr block is a really a in cicdr notation (containing a '/')
check that ip address and cidr are either both ipv6 or both ipv4
define the behaviour for edge cases like 0.0.0.0 (or just rely on the api)
define handling of special ip adresses like multicast, anycast, 255...* etc

Thanks

Copy link
Member

@YANG-DB YANG-DB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override
public Boolean apply(String ipAddress, String cidrBlock) {

IPAddressString parsedIpAddress = new IPAddressString(ipAddress, valOptions);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we leverage InetAddresses in opensearch-common package to parse IpAddress and CIDR?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's anything wrong with it. It is a well-maintained package

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not concerning the package is well-maintained or not.
My point is to be consistent with OpenSearch's parsing and validation behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo Can you provide a link to code or docs where we can see the OpenSearch's parsing and validation behavior.?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/opensearch-project/OpenSearch/blob/main/libs/common/src/main/java/org/opensearch/common/network/InetAddresses.java#L358

    public static InetAddress forString(String ipString) {
        byte[] addr = ipStringToBytes(ipString);

        // The argument was malformed, i.e. not an IP string literal.
        if (addr == null) {
            throw new IllegalArgumentException(String.format(Locale.ROOT, "'%s' is not an IP string literal.", ipString));
        }

        return bytesToInetAddress(addr);
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo Thanks. But please see #706 (comment)

IPAddressString parsedCidrBlock = new IPAddressString(cidrBlock, valOptions);

try {
parsedCidrBlock.validate();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we align the validation with OpenSearch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dr-lilienthal I agree with @penghuo
lets consolidate for using the same functionality
thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I read this and the above thread correct the idea is to remove the https://github.com/seancfoley/IPAddress lib dependency and instead use classes of the org.opensearch.common.network, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactally.

Copy link
Contributor Author

@salyh salyh Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo @YANG-DB Got it. But I dont see any methods to check if the ipaddress is within the cidr block or not. Did I overlooked something or should we implement it ourself? See the discussions above: 1 and 2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo please assist...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@salyh for cidr match, I want to confirm if the behavior of the term query is as expected. If it is, I can help locate the relevant code.

GET my-index-000001/_search
{
  "query": {
    "term": {
      "ip_addr": "192.168.0.0/16"
    }
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo yes it is. Is the relevant code this?: https://github.com/opensearch-project/OpenSearch/blob/119abaff95a7f54affea6c844fad7ce3c8360155/server/src/main/java/org/opensearch/index/mapper/IpFieldMapper.java#L248

If so I try to re-implement based on org.apache.lucene.document.InetAddressPoint.newPrefixQuery() which evalutes to something like

byte[] lower = value.getAddress();
    byte[] upper = value.getAddress();
    for (int i = prefixLength; i < 8 * lower.length; i++) {
      int m = 1 << (7 - (i & 7));
      lower[i >> 3] &= ~m;
      upper[i >> 3] |= m;
    }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IpField term query is rewritten as PointRangeQuery in lucene. I can think 2 approach

  1. read value from child operator, index into Lucene, and run range query.
  2. leverage parseCidr logic, and perform match query using https://github.com/seancfoley/IPAddress.

@YANG-DB
Copy link
Member

YANG-DB commented Oct 11, 2024

@dr-lilienthal can u plz resolve the conflicts and also signOff the commits ?( DCO failure )
thanks

salyh and others added 7 commits October 25, 2024 10:49
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
…h exception runtime exception in case of a mixed adress type

Signed-off-by: Jens Schmidt <jens.schmidt@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Jens Schmidt <jens.schmidt@eliatra.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.6 Lang:PPL Pipe Processing Language support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants