Longer Traversals On JanusGraph Seem To Be Slower Since Titan #514

Open
fppt opened this issue Sep 13, 2017 · 3 comments

fppt commented Sep 13, 2017

Hi All,

We have recently realised that our longer traversals have become slower since switching from Titan to JanusGraph and we are trying to debug why. To do this we have been comparing the traversal plans with the following test code:

GraphTraversal<Vertex, Vertex> traversal = graph.traversal().V().out().as("a").out().select("a").in().in().in();
traversal.asAdmin().applyStrategies();
System.out.println(traversal);

With JanusGraph this traversal gives us:

[JanusGraphStep(vertex,[]), JanusGraphVertexStep(OUT,vertex), NoOpBarrierStep(2500)@[a], JanusGraphVertexStep(OUT,vertex), SelectOneStep(a), NoOpBarrierStep(2500), JanusGraphVertexStep(IN,vertex), NoOpBarrierStep(2500), JanusGraphVertexStep(IN,vertex), NoOpBarrierStep(2500), JanusGraphVertexStep(IN,vertex)]

And with Titan we get:

[TitanGraphStep([],vertex), TitanVertexStep(OUT,vertex)@[a], TitanVertexStep(OUT,vertex), SelectOneStep(a), TitanVertexStep(IN,vertex), TitanVertexStep(IN,vertex), TitanVertexStep(IN,vertex)]

The latest TinkerPop/JanusGraph seems to have added a lot of NoOpBarrierSteps, and we think these are causing our performance troubles.

Can anyone enlighten us on why these barrier steps were introduced and whether there is a way around them?

Member

pluradj commented Sep 13, 2017

Running explain() on the traversal shows that it is the PathRetractionStrategy which is inserting those. Then searching on gremlin-users, this post from Marko talks about how @twilmes developed the strategy, which was introduced in TinkerPop 3.2.1. Ultimately, the NoOpBarrierStep is part of the bulking concept: The Beauty of Bulking and The Beauty of Bulking -- Redux'prise.

You could certainly remove a traversal strategy if there are defaults you don't want to use.

Author

fppt commented Sep 13, 2017

Hey @pluradj. Yeah, we ended up removing PathRetractionStrategy and LazyBarrierStrategy to get our old performance back.

I will have to look into those links and learn more. I can kind of see the benefit when doing large traversals, but I don't see it for long traversals which only need to return one result. In our case we often limit the output to just a few results, and there the NoOpBarrierStep seems to be a hindrance.
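
For anyone wanting to try the same thing, the removal described above might look roughly like the sketch below. It assumes TinkerPop 3.2.x (the line JanusGraph builds on) is on the classpath. The class name `StrategyRemovalSketch` and the helpers `withoutBarriers` and `printPlan` are made up for illustration and are not part of either library. Whether dropping these strategies is a good idea depends on the workload, so treat this as a starting point rather than a recommendation.

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.strategy.optimization.LazyBarrierStrategy;
import org.apache.tinkerpop.gremlin.process.traversal.strategy.optimization.PathRetractionStrategy;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.Vertex;

// Hypothetical helper class, named here only for the example.
final class StrategyRemovalSketch {

    /** Returns a traversal source with the two strategies named in this thread removed. */
    @SuppressWarnings("unchecked")
    static GraphTraversalSource withoutBarriers(Graph graph) {
        return graph.traversal()
                .withoutStrategies(PathRetractionStrategy.class, LazyBarrierStrategy.class);
    }

    /** Prints the strategy-by-strategy explanation for the traversal from the original report. */
    static void printPlan(Graph graph) {
        GraphTraversal<Vertex, Vertex> traversal = withoutBarriers(graph)
                .V().out().as("a").out().select("a").in().in().in();
        // explain() shows how each remaining strategy rewrites the traversal.
        System.out.println(traversal.explain());
    }
}
```

Comparing the output of `printPlan` against the same traversal built from a plain `graph.traversal()` should show whether any NoOpBarrierStep is still being inserted.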

Author

fppt commented Sep 15, 2017

Just for people's information: the new PathRetractionStrategy and LazyBarrierStrategy actually scale very nicely. The problem arises if you do something like the following in a very large graph:

g.V().out().out().out().out().out().out().out().out().out().out().out().out().out().out().out().limit(2);

In other words, when you perform a long traversal only to fetch a limited number of results, the performance is poor. With this type of traversal and the PathRetractionStrategy enabled, you are going to wait a long time for very few results.

I think this is because the limit(2) step is not propagated back through all the NoOpBarrierSteps.
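
To make the suspected effect concrete, here is a toy model in plain Java. It is not TinkerPop code and does not reproduce its internals: `expand` stands in for an `out()` step with a fixed fan-out, `barrier` stands in for a NoOpBarrierStep that fills a buffer before emitting anything, and `workDone` counts how many elements the steps produce before a trailing limit is satisfied. All names and the fixed fan-out are assumptions made for the sketch.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

// Toy model only: a lazy pipeline with optional buffering between steps.
final class BarrierLimitSketch {

    /** Lazily turns each input into `fanout` outputs and counts every output produced. */
    static Iterator<Integer> expand(Iterator<Integer> in, int fanout, AtomicLong work) {
        return new Iterator<Integer>() {
            private int remaining = 0;

            @Override public boolean hasNext() {
                // Pull one upstream element only when the current one is exhausted.
                while (remaining == 0 && in.hasNext()) {
                    in.next();
                    remaining = fanout;
                }
                return remaining > 0;
            }

            @Override public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                remaining--;
                work.incrementAndGet();
                return 0;
            }
        };
    }

    /** Buffers up to `size` upstream elements before handing any of them downstream. */
    static Iterator<Integer> barrier(Iterator<Integer> in, int size) {
        return new Iterator<Integer>() {
            private final ArrayDeque<Integer> buffer = new ArrayDeque<>();

            @Override public boolean hasNext() {
                if (buffer.isEmpty()) {
                    while (buffer.size() < size && in.hasNext()) buffer.add(in.next());
                }
                return !buffer.isEmpty();
            }

            @Override public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return buffer.poll();
            }
        };
    }

    /** Builds `depth` expand steps (barriers in between if barrierSize > 0) and pulls `limit` results. */
    static long workDone(int depth, int fanout, int barrierSize, int limit) {
        AtomicLong work = new AtomicLong();
        // One million start elements stand in for V() on a large graph.
        Iterator<Integer> it = IntStream.range(0, 1_000_000).boxed().iterator();
        for (int step = 1; step <= depth; step++) {
            it = expand(it, fanout, work);
            // As in the plan printed earlier, no barrier follows the final step.
            if (barrierSize > 0 && step < depth) it = barrier(it, barrierSize);
        }
        for (int i = 0; i < limit && it.hasNext(); i++) it.next();
        return work.get();
    }

    public static void main(String[] args) {
        System.out.println("no barriers:   " + workDone(15, 3, 0, 2));
        System.out.println("barriers(2500): " + workDone(15, 3, 2500, 2));
    }
}
```

In this model, 15 steps with a fan-out of 3 and a limit of 2 produce 16 elements without barriers and 35002 with a 2500-element barrier after each of the first 14 steps, since every barrier fills completely before the limit can cut things short. Real traversals differ in many ways (bulking merges equal traversers, and cost is dominated by backend reads), so the numbers only illustrate the shape of the problem.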

pluradj changed the title from "Longer Traversals On Janus Seem To Be Slower Since Titan" to "Longer Traversals On JanusGraph Seem To Be Slower Since Titan" on Oct 25, 2017