-
Notifications
You must be signed in to change notification settings - Fork 495
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Browse the repository at this point in the history
) (#4183) * [NOID] Fixes #3426: add apache arrow import procedure to extended (#3978) * [NOID] java 11 changes * [NOID] try removing gradle deps * [NOID] 4.4 changes * [NOID] spotless and licence changes * [NOID] fix tests * [NOID] format changes
- Loading branch information
Showing
8 changed files
with
687 additions
and
0 deletions.
There are no files selected for viewing
118 changes: 118 additions & 0 deletions
118
docs/asciidoc/modules/ROOT/pages/overview/apoc.import/apoc.import.arrow.adoc
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
= apoc.import.arrow | ||
:description: This section contains reference documentation for the apoc.import.arrow procedure. | ||
|
||
label:procedure[] label:apoc-extended[] | ||
|
||
[.emphasis] | ||
apoc.import.arrow(input, $config) - Imports arrow from the provided arrow file or byte array | ||
|
||
== Signature | ||
|
||
[source] | ||
---- | ||
apoc.import.arrow(urlOrBinaryFile :: ANY?, config = {} :: MAP?) :: (file :: STRING?, source :: STRING?, format :: STRING?, nodes :: INTEGER?, relationships :: INTEGER?, properties :: INTEGER?, time :: INTEGER?, rows :: INTEGER?, batchSize :: INTEGER?, batches :: INTEGER?, done :: BOOLEAN?, data :: STRING?) | ||
---- | ||
|
||
== Input parameters | ||
[.procedures, opts=header] | ||
|=== | ||
| Name | Type | Default | ||
|urlOrBinaryFile|ANY?|null | ||
|config|MAP?|{} | ||
|=== | ||
|
||
== Config parameters | ||
This procedure supports the following config parameters: | ||
|
||
.Config parameters | ||
[opts=header, cols='1a,1a,1a,3a'] | ||
|=== | ||
| name | type |default | description | ||
| unwindBatchSize | Integer | `2000` | the batch size of the unwind | ||
| mapping | Map | `{}` | see `Mapping config` example below | ||
|=== | ||
|
||
== Output parameters | ||
[.procedures, opts=header] | ||
|=== | ||
| Name | Type | ||
|file|STRING? | ||
|source|STRING? | ||
|format|STRING? | ||
|nodes|INTEGER? | ||
|relationships|INTEGER? | ||
|properties|INTEGER? | ||
|time|INTEGER? | ||
|rows|INTEGER? | ||
|batchSize|INTEGER? | ||
|batches|INTEGER? | ||
|done|BOOLEAN? | ||
|data|STRING? | ||
|=== | ||
|
||
[[usage-apoc.import.arrow]] | ||
== Usage Examples | ||
|
||
The `apoc.import.arrow` procedure can be used to import arrow files created by the `apoc.export.arrow.*` procedures. | ||
|
||
|
||
[source,cypher] | ||
---- | ||
CALL apoc.import.arrow("fileCreatedViaExportProcedures.arrow") | ||
---- | ||
|
||
.Results | ||
[opts=header] | ||
|=== | ||
| file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data | ||
| "fileCreatedViaExportProcedures.arrow" | "file" | "arrow" | 3 | 1 | 15 | 105 | 4 | -1 | 0 | TRUE | NULL | ||
|=== | ||
|
||
|
||
We can also import a file from a binary `byte[]` created by the `apoc.export.arrow.stream.*` procedures. | ||
|
||
[source,cypher] | ||
---- | ||
CALL apoc.import.arrow(`<binaryArrow>`) | ||
---- | ||
|
||
=== Mapping config | ||
|
||
In order to import complex types not supported by Parquet, like Point, Duration, List of Duration, etc.. | ||
we can use the mapping config to convert to the desired data type. | ||
For example, if we have a node `(:MyLabel {durationProp: duration('P5M1.5D')}`, and we export it in a parquet file/binary, | ||
we can import it by explicit a map with key the property key, and value the property type. | ||
|
||
That is in this example, by using the load procedure: | ||
[source,cypher] | ||
---- | ||
CALL apoc.load.arrow(fileOrBinary, {mapping: {durationProp: 'Duration'}}) | ||
---- | ||
|
||
Or with the import procedure: | ||
[source,cypher] | ||
---- | ||
CALL apoc.import.parquet(fileOrBinary, {mapping: {durationProp: 'Duration'}}) | ||
---- | ||
|
||
The mapping value types can be one of the following: | ||
|
||
* `Point` | ||
* `LocalDateTime` | ||
* `LocalTime` | ||
* `DateTime` | ||
* `Time` | ||
* `Date` | ||
* `Duration` | ||
* `Char` | ||
* `Byte` | ||
* `Double` | ||
* `Float` | ||
* `Short` | ||
* `Int` | ||
* `Long` | ||
* `Node` | ||
* `Relationship` | ||
* `BaseType` followed by Array, to map a list of values, where BaseType can be one of the previous type, for example `DurationArray` | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.