-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple edges with same edgeid using VertexCentric indices after modifications to MULTI edge #301
Comments
Any updates here, we are suffering from this issue in our Live environment. |
Any update, on this? |
@pankajydv — could you please try to see if it's possible to reproduce the same issue with JanusGraph 0.2.0 and/or |
@mbrukman - thanks for the update, let's test this on the latest JanusGraph codebase and we'll let you know the results. |
Same issue is reproducible on JanusGraph 0.2.0 with Apache Cassandra 3.0.9 as well. The properties corresponding to are following: storage.backend=cassandra
storage.cassandra.replication-strategy-class=org.apache.cassandra.locator.NetworkTopologyStrategy
storage.cassandra.replication-strategy-options= dc1, 2
storage.hostname=xxx.xxx.xxx.xxx
storage.cassandra.keyspace=janustest
storage.cassandra.frame-size-mb=300
storage.read-only=false
query.force-index=true |
@mbrukman - one of my colleague tried this code with JanusGraph 0.2 and Apache Cassandra 3.0.9 and we are still seeing this issue - request you to please prioritize this as this sometimes affect our live production system and is tedious to fix. In our observation what goes wrong is 2 or more copies get indexed in the VertexCentric index having same edge id (if you query in such a way that this VertexCentric index is not used then only 1 edge is found) and starts giving confusing results. This is even after we have set ConsistencyModifier to LOCK for this edge, and it's not possible to change the ConsistencyModifer of the VertexCentric index. |
@rahul38888 please attach the code and config you used to reproduce this issue here in zip format. |
This part has been removed because of the long text content. Same code can be found here |
First, this isn't a working example. It would be great if you had something where one could just do |
We have created a working maven test to reproduce the issue. Please find the project attached with this. Change the graph.properties in classpath accordingly. |
@robertdale We have updated our code in Junit Testcase format, so now the issue is reproducible via gremlin> edge=mgmt.getRelationType('hie_child')
gremlin> index=mgmt.getRelationIndex(edge, 'HSCCERDecr')
gremlin> mgmt.setConsistency(index, ConsistencyModifier.LOCK)
Cannot change consistency of schema element: HSCCERDecr
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.IllegalArgumentException: Cannot change consistency of schema element: HSCCERDecr
at org.janusgraph.graphdb.database.management.ManagementSystem.setConsistency(ManagementSystem.java:1115) |
Isn’t consistency part immutable? I think you need to set the consistency when you create the index. |
Thanks @amcp . I tried to set consistency after building vertex-centric index, as following RelationTypeIndex hie_childIndex = mgmt.buildEdgeIndex(label, "HSCCERDecr", Direction.BOTH, Order.decr,
mgmt.getPropertyKey("hostid_e"),
mgmt.getPropertyKey("status_e"),
mgmt.getPropertyKey("cmstype_e"),
mgmt.getPropertyKey("hrank"));
mgmt.setConsistency(hie_childIndex, ConsistencyModifier.LOCK); This throws following exception:
Also, there seems no implementation of JanusGraphManagement#buildEdgeIndex which would take consistency as an argument. Its seems this is not allowed even when the index is being created. Please guide me if I am missing something. |
The re-runnable code to reproduce the issue is attached below: |
Hi @rahul38888 I'd like to help reproduce the issue. Would it be possible for you to put the code from your zip into a fresh github repository with the steps required to reproduce on a fresh system included in the readme? |
@chupman I have uploaded this test-case to a fresh git repository. Please let us know if you need more information. |
Thanks @pankajydv, I was able to recreate the issue with the repo. I was trying the same thing with the zip yesterday and it was failing before it got to the use case. I forked and uploaded the output xml since 2400 lines seemed a bit long for a Gist. |
@chupman - are you still working on this. Can you please share, if there is an update. |
It doesn't look like the test code ( |
We tried to figure out the root cause of this problem by looking at the code of tag v0.2.0 and we see that the code in
We modified the code of the acquireLock method to return true even in case of 'MULTI' edges if the edge consistency level is set to LOCK and could resolve this issue. Can anyone please let's know if this can lead to any discrepancy. Please note we have already tested MULTIPLE edges creation between a pair of vertices and it works fine with this change. |
Looking at the code we can conclude that an edge with a multiplicity of MULTI will never be able to acquire LOCK. Can someone please throw some light on why is an edge with multiplicity MULTI is not supposed to acquire lock and if it is then how can we make the edge modifications acquire the LOCK. Please note edge consistency is already set to LOCK. |
@pankajydv Do you have a public fork with your changes? It would be easier for understanding what you are proposing, and ultimately could lead to a pull request from you. |
HI @pankajydv, I'm curious why you're locking the edge label instead of the property. Based off the docs it seems like it's not recommended to rely on locking this way with Cassandra. Would using the consistency modifier FORK rather than LOCK work for your use case? |
|
Hello, any new on this ? is the only way to handle concurrent update on an edge is to get duplicate edges and filter them out later ? |
Any updates on this? It's an important issue with data consistency I think. |
The issue reproduces randomly on my production environment (on Titan), and reproduces on my dev environment pretty easily even with the ConsistencyModifier of the edge set to LOCK.
Following is the code, I used to reproduce the issue on my local environment on JanusGraph 0.1.1, Backed by Apache Cassandra 2.1.7
The problem is If I try to concurrently modify an edge with n threads, I get n copies of the same edge, with same edgeid and then gremlin queries start behaving very abruptly. Like if the direct children on some vertex are m, and if I fire a query which uses the VertexCentricIndex the number of children returned are m+n (n is the number of threads I used for concurrent modifications). I used the following code to reproduce the issue.
Following is the behavior, I observed through gremlin console:
Before the concurrent modification, number of edges with or without the VertexCentric index are same.
After the concurrent modification with 10 threads. I see 10 edges between the parent/child with VertexCentric index and only 1 wihtout the index.
If I try to drop only a single edge, it seems as if all the edges are dropped.
But n-1 edges reapper through the VertexCentric index after the transaction is committed, but without using the index there is no edge.
Please note within the transaction after deleting the affected edge, it seems all the edges are deleted, but after transaction commit, n-1 edges re-appear using the VertexCentric index query, but there is no edge if that VertexCentric Index is not used.
The text was updated successfully, but these errors were encountered: