|  | 
|  | 1 | +--- | 
|  | 2 | +layout: default | 
|  | 3 | +title: IP2Geo | 
|  | 4 | +parent: Ingest processors  | 
|  | 5 | +grand_parent: Ingest APIs | 
|  | 6 | +nav_order: 130 | 
|  | 7 | +--- | 
|  | 8 | + | 
|  | 9 | +# IP2Geo | 
|  | 10 | +Introduced 2.10 | 
|  | 11 | +{: .label .label-purple } | 
|  | 12 | + | 
|  | 13 | +The `ip2geo` processor adds information about the geographical location of an IPv4 or IPv6 address. The processor uses IP geolocation (GeoIP) data from an external endpoint and therefore requires an additional component, `datasource`, that defines where to download GeoIP data from and how frequently to update it. | 
|  | 14 | + | 
|  | 15 | +{::nomarkdown}<img src="{{site.url}}{{site.baseurl}}/images/icons/info-icon.png" class="inline-icon" alt="info icon"/>{:/} **NOTE**<br>The `ip2geo` processor maintains the GeoIP data mapping in system indexes. The GeoIP mapping is retrieved from these indexes during data ingestion to perform the IP-to-geolocation conversion on the incoming data. For optimal performance, use a node with both the ingest and data roles because this configuration avoids internode calls, reducing latency. Also, because the `ip2geo` processor searches the GeoIP mapping data in these indexes, search performance is affected. | 
|  | 16 | +{: .note} | 
|  | 17 | + | 
|  | 18 | +## Getting started | 
|  | 19 | + | 
|  | 20 | +To get started with the `ip2geo` processor, you must first install the `opensearch-geospatial` plugin. See [Installing plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/) to learn more. | 
|  | 21 | + | 
|  | 22 | +## Cluster settings | 
|  | 23 | + | 
|  | 24 | +The IP2Geo data source and `ip2geo` processor node settings are listed in the following table. | 
|  | 25 | + | 
|  | 26 | +| Key | Description | Default | | 
|  | 27 | +|--------------------|-------------|---------| | 
|  | 28 | +| `plugins.geospatial.ip2geo.datasource.endpoint` | The default endpoint used when creating a data source through the data source API. | `https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json` | | 
|  | 29 | +| `plugins.geospatial.ip2geo.datasource.update_interval_in_days` | The default update interval, in days, used when creating a data source through the data source API. | 3 | | 
|  | 30 | +| `plugins.geospatial.ip2geo.datasource.batch_size` | The maximum number of documents to ingest in a bulk request during the IP2Geo data source creation process. | 10,000 | | 
|  | 31 | +| `plugins.geospatial.ip2geo.processor.cache_size` | The maximum number of results that can be cached. Each node uses a single cache shared by all IP2Geo processors. | 1,000 | | 
|  | 32 | +|-------------------|-------------|---------| | 
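|  |  | + | 
|  |  | +As a hedged sketch, the following request changes the default update interval through the standard cluster settings API. It assumes that `plugins.geospatial.ip2geo.datasource.update_interval_in_days` can be updated dynamically, which this page does not confirm; if the request is rejected, set the value in `opensearch.yml` instead. The value `7` is only an illustration: | 
|  |  | + | 
|  |  | +```json | 
|  |  | +PUT /_cluster/settings | 
|  |  | +{ | 
|  |  | +  "persistent": { | 
|  |  | +    "plugins.geospatial.ip2geo.datasource.update_interval_in_days": 7 | 
|  |  | +  } | 
|  |  | +} | 
|  |  | +``` | 
|  |  | +{% include copy-curl.html %} | 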
|  | 33 | + | 
|  | 34 | +## Creating the IP2Geo data source | 
|  | 35 | + | 
|  | 36 | +Before creating the pipeline that uses the `ip2geo` processor, create the IP2Geo data source. The data source defines the endpoint from which GeoIP data is downloaded and specifies the update interval. | 
|  | 37 | + | 
|  | 38 | +OpenSearch provides the following endpoints for the GeoLite2 City, GeoLite2 Country, and GeoLite2 ASN databases from [MaxMind](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data), which are shared under the [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license: | 
|  | 39 | + | 
|  | 40 | +* GeoLite2 City: https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json | 
|  | 41 | +* GeoLite2 Country: https://geoip.maps.opensearch.org/v1/geolite2-country/manifest.json | 
|  | 42 | +* GeoLite2 ASN: https://geoip.maps.opensearch.org/v1/geolite2-asn/manifest.json | 
|  | 43 | + | 
|  | 44 | +If an OpenSearch cluster cannot update a data source from the endpoints within 30 days, the cluster does not add GeoIP data to the documents and instead adds `"error":"ip2geo_data_expired"`. | 
|  | 45 | + | 
|  | 46 | +### Data source options | 
|  | 47 | + | 
|  | 48 | +The following table lists the data source options for the `ip2geo` processor.    | 
|  | 49 | + | 
|  | 50 | +| Name | Required | Default | Description | | 
|  | 51 | +|------|----------|---------|-------------| | 
|  | 52 | +| `endpoint` | Optional | https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json | The endpoint from which the GeoIP data is downloaded. | | 
|  | 53 | +| `update_interval_in_days` | Optional | 3 | How frequently, in days, the GeoIP data is updated. The minimum value is 1. | | 
|  | 54 | + | 
|  | 55 | +To create an IP2Geo data source, run the following query: | 
|  | 56 | + | 
|  | 57 | +```json | 
|  | 58 | +PUT /_plugins/geospatial/ip2geo/datasource/my-datasource | 
|  | 59 | +{ | 
|  | 60 | +    "endpoint" : "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json", | 
|  | 61 | +    "update_interval_in_days" : 3 | 
|  | 62 | +} | 
|  | 63 | +``` | 
|  | 64 | +{% include copy-curl.html %} | 
|  | 65 | + | 
|  | 66 | +A `true` response means that the request was successful and that the server was able to process the request. A `false` response indicates that you should check the request to make sure it is valid, check the URL to make sure it is correct, or try again. | 
|  | 67 | + | 
|  | 68 | +### Sending a GET request | 
|  | 69 | + | 
|  | 70 | +To get information about one or more IP2Geo data sources, send a GET request:   | 
|  | 71 | + | 
|  | 72 | +```json | 
|  | 73 | +GET /_plugins/geospatial/ip2geo/datasource/my-datasource | 
|  | 74 | +``` | 
|  | 75 | +{% include copy-curl.html %} | 
|  | 76 | + | 
|  | 77 | +You'll receive the following response: | 
|  | 78 | + | 
|  | 79 | +```json | 
|  | 80 | +{ | 
|  | 81 | +  "datasources": [ | 
|  | 82 | +    { | 
|  | 83 | +      "name": "my-datasource", | 
|  | 84 | +      "state": "AVAILABLE", | 
|  | 85 | +      "endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json", | 
|  | 86 | +      "update_interval_in_days": 3, | 
|  | 87 | +      "next_update_at_in_epoch_millis": 1685125612373, | 
|  | 88 | +      "database": { | 
|  | 89 | +        "provider": "maxmind", | 
|  | 90 | +        "sha256_hash": "0SmTZgtTRjWa5lXR+XFCqrZcT495jL5XUcJlpMj0uEA=", | 
|  | 91 | +        "updated_at_in_epoch_millis": 1684429230000, | 
|  | 92 | +        "valid_for_in_days": 30, | 
|  | 93 | +        "fields": [ | 
|  | 94 | +          "country_iso_code", | 
|  | 95 | +          "country_name", | 
|  | 96 | +          "continent_name", | 
|  | 97 | +          "region_iso_code", | 
|  | 98 | +          "region_name", | 
|  | 99 | +          "city_name", | 
|  | 100 | +          "time_zone", | 
|  | 101 | +          "location" | 
|  | 102 | +        ] | 
|  | 103 | +      }, | 
|  | 104 | +      "update_stats": { | 
|  | 105 | +        "last_succeeded_at_in_epoch_millis": 1684866730192, | 
|  | 106 | +        "last_processing_time_in_millis": 317640, | 
|  | 107 | +        "last_failed_at_in_epoch_millis": 1684866730492, | 
|  | 108 | +        "last_skipped_at_in_epoch_millis": 1684866730292 | 
|  | 109 | +      } | 
|  | 110 | +    } | 
|  | 111 | +  ] | 
|  | 112 | +} | 
|  | 113 | +``` | 
|  | 114 | + | 
|  | 115 | +### Updating an IP2Geo data source | 
|  | 116 | + | 
|  | 117 | +See [Creating the IP2Geo data source](#creating-the-ip2geo-data-source) for a list of endpoints and request field descriptions. | 
|  | 118 | + | 
|  | 119 | +To update the data source, run the following query: | 
|  | 120 | + | 
|  | 121 | +```json | 
|  | 122 | +PUT /_plugins/geospatial/ip2geo/datasource/my-datasource/_settings | 
|  | 123 | +{ | 
|  | 124 | +    "endpoint": "https://geoip.maps.opensearch.org/v1/geolite2-city/manifest.json", | 
|  | 125 | +    "update_interval_in_days": 10 | 
|  | 126 | +} | 
|  | 127 | +``` | 
|  | 128 | +{% include copy-curl.html %} | 
|  | 129 | + | 
|  | 130 | +### Deleting the IP2Geo data source | 
|  | 131 | + | 
|  | 132 | +To delete the IP2Geo data source, you must first delete all processors associated with the data source. Otherwise, the request fails.  | 
|  | 133 | + | 
|  | 134 | +To delete the data source, run the following query: | 
|  | 135 | + | 
|  | 136 | +```json | 
|  | 137 | +DELETE /_plugins/geospatial/ip2geo/datasource/my-datasource | 
|  | 138 | +``` | 
|  | 139 | +{% include copy-curl.html %} | 
|  | 140 | + | 
|  | 141 | +## Creating the pipeline | 
|  | 142 | + | 
|  | 143 | +Once the data source is created, you can create the pipeline. The following is the syntax for the `ip2geo` processor: | 
|  | 144 | + | 
|  | 145 | +```json  | 
|  | 146 | +{ | 
|  | 147 | +  "ip2geo": { | 
|  | 148 | +    "field":"ip", | 
|  | 149 | +    "datasource":"my-datasource" | 
|  | 150 | +  } | 
|  | 151 | +} | 
|  | 152 | +``` | 
|  | 153 | +{% include copy.html %} | 
|  | 154 | + | 
|  | 155 | +### Configuration parameters | 
|  | 156 | + | 
|  | 157 | +The following table lists the required and optional parameters for the `ip2geo` processor. | 
|  | 158 | + | 
|  | 159 | +| Name | Required | Default | Description | | 
|  | 160 | +|------|----------|---------|-------------| | 
|  | 161 | +| `datasource` | Required | - | The data source name to use to retrieve geographical information. | | 
|  | 162 | +| `field` | Required | - | The field that contains the IP address for geographical lookup. | | 
|  | 163 | +| `ignore_missing` | Optional | `false` | If set to `true`, the processor does not modify the document if the field does not exist or is `null`. | | 
|  | 164 | +| `properties` | Optional | All fields in `datasource` | The field that controls which properties are added to `target_field` from `datasource`. | | 
|  | 165 | +| `target_field` | Optional | `ip2geo` | The field that contains the geographical information retrieved from the data source. | | 
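|  |  | + | 
|  |  | +As a minimal sketch of how the optional parameters combine, the following processor definition assumes a document field named `ip` and the `my-datasource` data source used elsewhere on this page. The `geo` target field name, the selected property names (taken from the `fields` list in the sample data source response), and the use of a JSON array for `properties` are illustrative assumptions rather than confirmed syntax: | 
|  |  | + | 
|  |  | +```json | 
|  |  | +{ | 
|  |  | +  "ip2geo": { | 
|  |  | +    "field": "ip", | 
|  |  | +    "datasource": "my-datasource", | 
|  |  | +    "target_field": "geo", | 
|  |  | +    "properties": ["country_name", "city_name", "location"], | 
|  |  | +    "ignore_missing": true | 
|  |  | +  } | 
|  |  | +} | 
|  |  | +``` | 
|  |  | + | 
|  |  | +Under these assumptions, only the three listed properties would be written to the `geo` field, and documents without an `ip` field would pass through unchanged. | 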
|  | 166 | + | 
|  | 167 | +## Using the processor | 
|  | 168 | + | 
|  | 169 | +Follow these steps to use the processor in a pipeline. | 
|  | 170 | + | 
|  | 171 | +**Step 1: Create a pipeline.** | 
|  | 172 | + | 
|  | 173 | +The following query creates a pipeline, named `my-pipeline`, that converts the IP address to geographical information: | 
|  | 174 | + | 
|  | 175 | +```json | 
|  | 176 | +PUT /_ingest/pipeline/my-pipeline | 
|  | 177 | +{ | 
|  | 178 | +   "description":"convert ip to geo", | 
|  | 179 | +   "processors":[ | 
|  | 180 | +    { | 
|  | 181 | +        "ip2geo":{ | 
|  | 182 | +            "field":"ip", | 
|  | 183 | +            "datasource":"my-datasource" | 
|  | 184 | +        } | 
|  | 185 | +    } | 
|  | 186 | +   ]  | 
|  | 187 | +} | 
|  | 188 | +``` | 
|  | 189 | +{% include copy-curl.html %} | 
|  | 190 | + | 
|  | 191 | +**Step 2 (Optional): Test the pipeline.** | 
|  | 192 | + | 
|  | 193 | +{::nomarkdown}<img src="{{site.url}}{{site.baseurl}}/images/icons/info-icon.png" class="inline-icon" alt="info icon"/>{:/} **NOTE**<br>It is recommended that you test your pipeline before you ingest documents. | 
|  | 194 | +{: .note} | 
|  | 195 | + | 
|  | 196 | +To test the pipeline, run the following query: | 
|  | 197 | + | 
|  | 198 | +```json | 
|  | 199 | +POST /_ingest/pipeline/my-pipeline/_simulate | 
|  | 200 | +{ | 
|  | 201 | +  "docs": [ | 
|  | 202 | +    { | 
|  | 203 | +      "_index":"my-index", | 
|  | 204 | +      "_id":"my-id", | 
|  | 205 | +      "_source":{ | 
|  | 206 | +        "ip":"172.0.0.1", | 
|  | 207 | +        "ip2geo":{ | 
|  | 208 | +         "continent_name":"North America", | 
|  | 209 | +         "region_iso_code":"AL", | 
|  | 210 | +         "city_name":"Calera", | 
|  | 211 | +         "country_iso_code":"US", | 
|  | 212 | +         "country_name":"United States", | 
|  | 213 | +         "region_name":"Alabama", | 
|  | 214 | +         "location":"33.1063,-86.7583", | 
|  | 215 | +         "time_zone":"America/Chicago" | 
|  | 216 | +         } | 
|  | 217 | +      } | 
|  | 218 | +    } | 
|  | 219 | +  ] | 
|  | 220 | +} | 
|  | 221 | +``` | 
|  | 222 | +{% include copy-curl.html %} | 
|  | 223 | + | 
|  | 224 | +**Step 3: Ingest a document.** | 
|  | 225 | + | 
|  | 226 | +The following query ingests a document into an index named `my-index`: | 
|  | 227 | + | 
|  | 228 | +```json | 
|  | 229 | +PUT /my-index/_doc/my-id?pipeline=my-pipeline | 
|  | 230 | +{ | 
|  | 231 | +  "ip": "172.0.0.1" | 
|  | 232 | +} | 
|  | 233 | +``` | 
|  | 234 | +{% include copy-curl.html %} | 
|  | 235 | + | 
|  | 236 | +**Step 4 (Optional): Retrieve the document.**  | 
|  | 237 | + | 
|  | 238 | +To retrieve the document, run the following query: | 
|  | 239 | + | 
|  | 240 | +```json | 
|  | 241 | +GET /my-index/_doc/my-id | 
|  | 242 | +``` | 
|  | 243 | +{% include copy-curl.html %} | 