Releases · neo4j-field/neo4j-arrow
v4.1 - Fix KHop bug
v4 - Bulk Database Imports
✨ New Stuff!
- Bulk import jobs (`import.bulk`) that support bootstrapping a new database on a Neo4j host by streaming nodes and relationships from a `neo4j-arrow` client. See the example notebook for how it works.
- New info jobs (`info.server` and `info.jobs`) for querying the server-side plugin version and currently tracked jobs. (See the `ServerInfoHandler` class, and the client-side sketch after this list.)
- The Python client/wrapper (`neo4j_arrow.py`) now has type annotations and passes MyPy in strict mode!
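For reference, the info jobs can be exercised without the wrapper. A minimal sketch using plain PyArrow Flight, assuming the jobs are exposed as Flight actions named after the jobs above (the host, port, and JSON result encoding are assumptions, not confirmed details):

```python
# Minimal sketch using plain PyArrow Flight instead of the neo4j_arrow.py
# wrapper. Assumes the info jobs are exposed as Flight actions named
# "info.server" and "info.jobs"; the host/port and the JSON encoding of
# the results are guesses.
import json
import pyarrow.flight as flight

client = flight.FlightClient("grpc+tcp://localhost:9999")

# Ask the server-side plugin for its version.
for result in client.do_action(flight.Action("info.server", b"")):
    print(json.loads(result.body.to_pybytes()))

# List the jobs the server is currently tracking.
for result in client.do_action(flight.Action("info.jobs", b"")):
    print(json.loads(result.body.to_pybytes()))
```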
⚙️ Changes in the Guts
- Redesigned the Producer parts handling write streams (where the client pushes data to the server). Should solve some minor bugs in existing GDS Write Jobs.
- Lots of runtime type inspection in the Python client.
- Squashed some sequencing/race-condition bugs in some of the GDS Write Jobs (they previously weren't properly advancing the job's status).
🔨 Major Breaking Changes
- Python wrapper code has been shuffled around and now lives in `./python`. (Will attach versions to GH releases to make it easier.)
- Job names and parameters have been standardized: snake case for parameters (e.g. `idField` => `id_field`) and lowercase dot notation for job names (e.g. `cypherRead` => `cypher.read`). A small before/after example follows this list.
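To illustrate the renames (the surrounding payload shape here is purely illustrative; only the two renames come from the notes above):

```python
# Illustrative only: the payload shape is made up, but the renames
# (idField -> id_field, cypherRead -> cypher.read) are real.
old_style = {"job": "cypherRead", "idField": "id"}    # v3 and earlier
new_style = {"job": "cypher.read", "id_field": "id"}  # v4 onward
```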
v3.1 - Plug some Memory Leaks
Some fixes to critical reliability issues with the v3 release:
- Set up the `VectorSchemaRoot` to use the memory allocator used by the flushing task.
- Close the `VectorSchemaRoot` before closing the allocators.
- Add some delays in the busy loop when attempting to allocate memory (in `WorkBuffer.init()`)...we were failing too fast. (A sketch of the pattern follows this list.)
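The actual fix lives in the plugin's Java code, but the pattern behind that last bullet is general: back off briefly between allocation attempts instead of spinning and giving up. A language-agnostic sketch in Python (names and timings are illustrative, not the real `WorkBuffer` code):

```python
import time

def allocate_with_backoff(allocate, attempts=100, delay_s=0.01):
    """Retry a transiently-failing allocation instead of failing fast.

    The old busy loop retried with no pause and burned through its
    attempts almost instantly; a short sleep gives in-flight buffers
    time to be released back to the allocator.
    """
    for _ in range(attempts):
        buf = allocate()
        if buf is not None:
            return buf
        time.sleep(delay_s)
    raise MemoryError("allocator still exhausted after retries")
```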
This version should be used in lieu of v3.
v3 - 2-hops and New Plumbing
- 🧪 Experimental k-hop (for `k=2`) implementation...see KHOP.md for details.
- 👨‍🔧 Major replumbing of the `Producer` code for reading streams, removing semaphores and lots of lock-contention points. Still WIP, but showing promise at increasing performance of all read-related jobs.
- 👟 Snuck in some special "extra" parameters that can be passed in GDS Read actions to tweak partition count, batch size, and list-length parameters (for k-hop) on a per-job basis. (A hypothetical payload follows this list.)
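A rough picture of what those per-job extras might look like on the wire (every key name below is a guess at the partition-count, batch-size, and list-length parameters; check the source for the real spelling):

```python
# Hypothetical GDS read payload showing where the per-job tuning knobs
# would ride along. The "extra" key names are guesses, not the plugin's
# actual parameter names.
gds_read_job = {
    "db": "neo4j",
    "graph": "mygraph",
    "partitions": 8,       # parallelism for this job only
    "batch_size": 10_000,  # rows per Arrow record batch
    "list_size": 1_000,    # cap on k-hop adjacency-list length
}
```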
Next up: more performance tuning! 🏎️
v2 - TLS Support & GDS Write Improvements
New Features
- TLS support (not yet supporting mutual TLS) for both client and server. A full-chain certificate and private key can be provided to the server via the new `ARROW_TLS_CERTIFICATE` and `ARROW_TLS_PRIVATE_KEY` env vars. The Python `neo4j_arrow.py` client has been updated to allow enabling TLS and also disabling certificate validation when needed. (A client-side sketch follows.)
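On the client side, the same can be done directly with PyArrow Flight if you are not using the wrapper; the host, port, and certificate path below are placeholders:

```python
import pyarrow.flight as flight

# Connect over TLS, trusting the server's full-chain certificate.
with open("fullchain.pem", "rb") as f:
    client = flight.FlightClient(
        "grpc+tls://neo4j.example.com:9999",
        tls_root_certs=f.read(),
    )

# Or, against a self-signed certificate in development, skip validation.
dev_client = flight.FlightClient(
    "grpc+tls://localhost:9999",
    disable_server_verification=True,
)
```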
Improvements & Fixes
- Easier-to-use Arrow memory settings, supporting suffixes (e.g. `g`, `m`, `t`) like when setting JVM heap size. For instance: `MAX_MEM_GLOBAL=52g`
- Longer default timeouts for write jobs.
- Fixed a memory leak when writing GDS graphs...they now clean up properly when using `CALL gds.graph.drop()` or when shutting down the server.
- Support for passing native PyArrow `Table` instances when putting a stream via the `neo4j_arrow` client. (See the sketch after this list.)
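The PyArrow `Table` support means a stream can be pushed without converting to record batches by hand. A sketch with the raw Flight API (the descriptor naming is a placeholder; the `neo4j_arrow` client wraps these steps):

```python
import pyarrow as pa
import pyarrow.flight as flight

client = flight.FlightClient("grpc+tcp://localhost:9999")

# A native PyArrow Table of nodes to push to the server.
table = pa.table({
    "id": pa.array([1, 2, 3], type=pa.int64()),
    "labels": pa.array([["Person"], ["Person"], ["Movie"]]),
})

# Put the stream; how the descriptor should be named is a placeholder here.
descriptor = flight.FlightDescriptor.for_command(b"my-node-stream")
writer, _ = client.do_put(descriptor, table.schema)
writer.write_table(table)
writer.close()
```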
Known Issues & Future Work
- No ability to write relationship properties
- Cypher support needs some more love
- Error handling of jobs could use improvement
- GDS Writes of relationships end up using inefficient Java types for adjacency lists, etc.
- GDS Write jobs could be improved by removing the synchronous step of fully collecting the stream before processing it
v1 - The Line in the Sand
Figured I need to start "tagging" something to have a referenceable build I've personally tested.
At this point, the following should be working:
- reading nodes and their labels and properties
- reading relationships and their types and properties
- writing nodes with labels and properties (those supported by GDS)
- writing relationships and types (no properties, yet!)
There are definite perf bottlenecks in some post-processing after doing writes as well as some timing issues in the write jobs.
For instance, if you want to build a graph you need to do the following (sketched in code after this list):
- Write the nodes, supplying a new graph name (it will be created)
- Wait until you see on the server side (via logs) that it's complete, as the client will report success once the data transfers. (I need a status indicator somewhere.)
- Then write the relationships.
- Same as with nodes, keep an eye on the server and see when it completes.
- The graph should be available for use now.
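Put together as code, the sequence looks roughly like this; the helper functions are hypothetical stand-ins for the real client calls, since there is no status API yet and "complete" means watching the server logs:

```python
import time

# Hypothetical stand-ins for the neo4j_arrow client calls; the real
# wrapper's method names and signatures differ.
def write_nodes(graph, table):
    print(f"streaming nodes into {graph}...")

def write_relationships(graph, table):
    print(f"streaming relationships into {graph}...")

def server_side_complete(graph, phase):
    # Today this means watching the server logs; there is no status API.
    return True

nodes, rels = ..., ...  # placeholder node/relationship data

# 1. Write the nodes, supplying a new graph name (it will be created).
write_nodes("mygraph", nodes)

# 2. The client reports success once the data transfers, so wait until
#    the server side says node processing is actually done.
while not server_side_complete("mygraph", "nodes"):
    time.sleep(1)

# 3. Then write the relationships and wait the same way.
write_relationships("mygraph", rels)
while not server_side_complete("mygraph", "relationships"):
    time.sleep(1)

# 4. The graph should now be available for use.
```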