1.7: issues with create edge or create class on distributed configuration #2389

Closed
tanji opened this issue May 26, 2014 · 13 comments

tanji commented May 26, 2014

Hello,

On the latest 1.7, CREATE EDGE on a Graph schema causes a timeout and several warnings. The creation succeeds in the end, but only after the request times out (10 seconds).

Server version: OrientDB Server v1.7-SNAPSHOT (build UNKNOWN@r; 2014-05-26 13:56:37-0400)
Ubuntu 14.04 64-bit, with OpenJDK 7
Default Hazelcast configuration, two nodes

Simple test case following.

orientdb {test}> create class Person extends V

Class created successfully. Total classes in database now: 13

orientdb {test}> create class Loves extends E

Class created successfully. Total classes in database now: 14

orientdb {test}> create vertex Person set name = 'xiaodu'

Created vertex 'Person#11:0{name:xiaodu} v1' in 0.029000 sec(s).

orientdb {test}> create vertex Person set name = 'xiaohu'

Created vertex 'Person#11:1{name:xiaohu} v1' in 0.026000 sec(s).

orientdb {test}> create edge Loves from #11:0 to #11:1 set created = sysdate()

[orientdb1] error on reading distributed request: command_sql(create edge Loves from #11:0 to #11:1 set created = sysdate())
Cannot dispatch response to the thread queue orientdb1
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
-> java.lang.Thread.run(Thread.java:744)
java.lang.NullPointerException
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
-> java.lang.Thread.run(Thread.java:744)
null
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
-> java.lang.Thread.run(Thread.java:744)
2014-05-26 23:50:36:124 WARN [orientdb1] timeout (10000ms) on waiting for synchronous responses from nodes=[orientdb2, orientdb1] responsesSoFar=[] request=id=26 from=orientdb1 task=command_sql(create edge Loves from #11:0 to #11:1 set created = sysdate()) [OHazelcastDistributedDatabase]
2014-05-26 23:50:36:127 WARN [orientdb1] detected 2 node(s) in timeout or in conflict and quorum (2) has not been reached, rolling back changes for request: id=26 from=orientdb1 task=command_sql(create edge Loves from #11:0 to #11:1 set created = sysdate()) [ODistributedResponseManager]
Cannot route COMMAND operation to the distributed node
Error on sending distributed request against database 'test.[]' to nodes [orientdb2, orientdb1]
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:192)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:361)
-> com.orientechnologies.orient.server.distributed.ODistributedStorage.command(ODistributedStorage.java:182)
-> com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:59)
-> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.command(ONetworkProtocolBinary.java:1151)
-> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:342)
-> com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:169)
-> com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:45)
Quorum 2 not reached for request=id=26 from=orientdb1 task=command_sql(create edge Loves from #11:0 to #11:1 set created = sysdate()). No server in conflict. Received: {orientdb2=waiting-for-response, orientdb1=waiting-for-response}
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:192)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:361)
-> com.orientechnologies.orient.server.distributed.ODistributedStorage.command(ODistributedStorage.java:182)
-> com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:59)
-> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.command(ONetworkProtocolBinary.java:1151)
-> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:342)
-> com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:169)
-> com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:45)
Error: com.orientechnologies.orient.server.distributed.ODistributedException: Quorum 2 not reached for request=id=26 from=orientdb1 task=command_sql(create edge Loves from #11:0 to #11:1 set created = sysdate()). No server in conflict. Received: {orientdb2=waiting-for-response, orientdb1=waiting-for-response}

On server 2:

[orientdb2]<-[orientdb1] error on reading distributed request: command_sql(create edge Loves from #11:0 to #11:1 set created = sysdate())
Cannot dispatch response to the thread queue orientdb1
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
-> java.lang.Thread.run(Thread.java:744)
java.lang.NullPointerException
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
-> java.lang.Thread.run(Thread.java:744)
null
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
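
The WARN lines and the final ODistributedException in the server 1 log describe a synchronous quorum wait: the command went to both nodes, no response arrived within 10000 ms, and with 0 of the 2 required responses the change was rolled back. The following is a minimal sketch of that decision only, not OrientDB's actual code. The class and method names are hypothetical, the status text `waiting-for-response` is copied from the log, and the 10-second wait itself is not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: QuorumSketch and its methods are hypothetical names,
// not part of OrientDB.
final class QuorumSketch {
    // Status text exactly as printed in the log for a node that has not answered.
    static final String WAITING = "waiting-for-response";

    // Count the nodes whose status is anything other than "still waiting".
    static int received(Map<String, String> responses) {
        int count = 0;
        for (String status : responses.values()) {
            if (!WAITING.equals(status)) {
                count++;
            }
        }
        return count;
    }

    // The quorum is reached once at least `quorum` nodes have answered.
    static boolean quorumReached(Map<String, String> responses, int quorum) {
        return received(responses) >= quorum;
    }

    public static void main(String[] args) {
        // State reported in the log: both nodes still waiting, quorum of 2.
        Map<String, String> responses = new LinkedHashMap<>();
        responses.put("orientdb2", WAITING);
        responses.put("orientdb1", WAITING);
        System.out.println(quorumReached(responses, 2)); // prints "false"
    }
}
```

With both entries still in the waiting state the check fails, which matches the "Quorum 2 not reached" message in the report.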
@tanji tanji changed the title from "latest 1.7-SNAPSHOT: issues with create edge or create class on distributed configuration" to "1.7: issues with create edge or create class on distributed configuration" May 27, 2014
@enisher enisher added this to the 1.7.1 milestone May 27, 2014
Author

tanji commented May 28, 2014

Hello, can we have an approximate ETA for the fix of this error? We plan to roll out an application based on OrientDB in production in about one week but this error makes OrientDB Graph Database unusable in distributed mode since there's a huge timeout at each named Edge creation.

Thanks in advance.

Member

lvca commented May 28, 2014

The line you report for the NPE is:

com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:500)

But at that line I see this statement, where iRequest cannot be null:

throw new ODistributedException("Cannot dispatch response to the thread queue " + iRequest.getSenderNodeName(), e);

So, can you please repeat the test with the latest 1.7.1-SNAPSHOT?
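
One possible reading of the reported trace, consistent with the throw statement quoted above, is that the NullPointerException is the wrapped cause `e` rather than something raised by `iRequest` at line 500. The sketch below only illustrates that wrapping pattern with standard Java. `RuntimeException` stands in for `ODistributedException`, and the class and helper names are hypothetical.

```java
// Illustrative only: RootCauseSketch is a hypothetical name, and RuntimeException
// is a stand-in for ODistributedException.
final class RootCauseSketch {
    // Follow getCause() links until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // Same shape as the quoted throw statement: a message plus the original cause.
    static RuntimeException wrap(String senderNodeName, Exception cause) {
        return new RuntimeException(
            "Cannot dispatch response to the thread queue " + senderNodeName, cause);
    }

    public static void main(String[] args) {
        RuntimeException outer = wrap("orientdb1", new NullPointerException());
        Throwable root = rootCause(outer);
        System.out.println(outer.getMessage());        // the wrapper's message
        System.out.println(root.getClass().getName()); // java.lang.NullPointerException
        System.out.println(root.getMessage());         // null (no message was set)
    }
}
```

The three printed lines have the same shape as the three headers in the reported trace (the wrapper message, `java.lang.NullPointerException`, then `null`). If this reading is right, the frame worth finding is the one where the cause was originally raised.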

@lvca lvca self-assigned this May 28, 2014
Author

tanji commented May 28, 2014

Hi Luca,

where can I find 1.7.1-SNAPSHOT?
I have been unable to find it on sonatype.

Thanks

Contributor

enisher commented May 28, 2014

Hi Tanji,

Sorry, we have not yet reconfigured the build server for the new branching model, so it has not been pushed to Sonatype yet.

To get 1.7.1-SNAPSHOT, you can check out the 1.7.1 branch with git and run `ant installg`.

Best regards,
Artem Orobets


Member

lvca commented May 28, 2014

While waiting for Jenkins, I've deployed it manually, so it's available now.

Author

tanji commented May 28, 2014

Thanks, guys. I've tested with the latest 1.7.1 branch and the error is similar. I don't know why the exception itself reports an NPE, but the problem seems to be located elsewhere. This is quite a blocker, because distributed mode is totally broken when using graph schemas.

Note that creating edges without properties (aka lightweight edges) doesn't break the server.

Member

lvca commented May 28, 2014

Without the stack trace we can't help you. Please can you provide it?

Author

tanji commented May 28, 2014

I think the issue is easy enough to reproduce on your own. Just start two servers in distributed mode, create an edge with a property, and you will get the error.

I have supplied the full stack trace output above in the initial issue.
Let me know if there's more that I can do.

Author

tanji commented May 30, 2014

Hello, have you been able to reproduce the issue?
Thanks in advance

@lvca lvca modified the milestones: 1.7.2, 1.7.1, 1.7.3 Jun 4, 2014
@andrii0lomakin andrii0lomakin modified the milestones: 1.7.4, 1.7.3 Jun 12, 2014
@lvca lvca modified the milestones: 1.7.5, 1.7.4 Jun 23, 2014
@lvca lvca modified the milestones: 1.7.6, 1.7.5, 1.7.7 Jul 10, 2014
Member

lvca commented Jul 22, 2014

@tanji We fixed similar issues. Please can you retry with 1.7.7-SNAPSHOT?

Member

lvca commented Jul 23, 2014

I'm closing it. If the error still occurs with 1.7.7, please comment here so we can reopen it.

@lvca lvca closed this as completed Jul 23, 2014
@lvca lvca added 6 - Done and removed 2 - Sprint labels Jul 23, 2014
Author

tanji commented Jul 23, 2014

@lvca There's no more NPE, but the edge is not correctly created: it is not referenced correctly in the Vertex.

----+-----+--------+------------+-----------
#   |@RID |name    |out_distance|in_distance
----+-----+--------+------------+-----------
0   |#16:0|milan   |null        |null       
1   |#16:1|new york|null        |null       
----+-----+--------+------------+-----------

Should be

----+-----+--------+------------+-----------
#   |@RID |name    |out_distance|in_distance
----+-----+--------+------------+-----------
0   |#12:0|milan   |#13:0       |null       
1   |#12:1|new york|null        |#13:0      
----+-----+--------+------------+-----------

Please reopen the issue.
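
The two tables above differ only in the `out_distance` and `in_distance` columns. The check being described can be sketched as follows: for an edge to be referenced correctly, its source vertex must list it under `out_<label>` and its target vertex under `in_<label>`. This is a simplified, self-contained model, not OrientDB API code. The class and helper names are hypothetical, each field holds a single RID here where a real vertex may hold a collection, and the edge label `distance` and the edge direction are inferred from the column names and the expected table.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: EdgeLinkCheck and its helpers are hypothetical names.
final class EdgeLinkCheck {
    // Build a vertex as a plain field map; null means the field is not set.
    static Map<String, String> vertex(String name, String label, String outEdge, String inEdge) {
        Map<String, String> fields = new HashMap<>();
        fields.put("name", name);
        fields.put("out_" + label, outEdge);
        fields.put("in_" + label, inEdge);
        return fields;
    }

    // True when the source vertex references the edge under out_<label>
    // and the target vertex references it under in_<label>.
    static boolean isLinked(Map<String, Map<String, String>> vertices,
                            String edgeRid, String label, String fromRid, String toRid) {
        Map<String, String> from = vertices.get(fromRid);
        Map<String, String> to = vertices.get(toRid);
        if (from == null || to == null) {
            return false;
        }
        return edgeRid.equals(from.get("out_" + label))
            && edgeRid.equals(to.get("in_" + label));
    }

    public static void main(String[] args) {
        // Expected state from the second table.
        Map<String, Map<String, String>> expected = new HashMap<>();
        expected.put("#12:0", vertex("milan", "distance", "#13:0", null));
        expected.put("#12:1", vertex("new york", "distance", null, "#13:0"));
        System.out.println(isLinked(expected, "#13:0", "distance", "#12:0", "#12:1")); // true

        // Observed state: both fields null (RIDs reused from the expected table for comparison).
        Map<String, Map<String, String>> observed = new HashMap<>();
        observed.put("#12:0", vertex("milan", "distance", null, null));
        observed.put("#12:1", vertex("new york", "distance", null, null));
        System.out.println(isLinked(observed, "#13:0", "distance", "#12:0", "#12:1")); // false
    }
}
```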

@lvca lvca reopened this Jul 23, 2014
@lvca lvca modified the milestones: 1.7.8, 1.7.7 Jul 23, 2014

srgg commented Jul 24, 2014

@tanji @lvca
here is the same issue #2582
