Consistency model / ACID #2913
-
Hey @linas et al., I'm wondering what the consistency model of an AtomSpace is (specifically, the ACI in ACID). I see mutexes used (IIRC, a per-atom mutex, but I may have seen a per-AtomSpace mutex floating around), and I believe I read somewhere about a QueryLink being executed in a "thread-safe" way. But does this mean that, e.g., QueryLinks are executed atomically, with all consequent atoms either being added or not added? Does it further mean that an AtomSpace is strictly serializable (that would be good news!), or does it merely guarantee a read-committed isolation level? I'd love AtomSpace to pass the Jepsen test!

This brings me to the durability question. It seems that RocksDB is the preferred persistence backend for AtomSpace, which is fine. But what about failure conditions? If a write to the AtomSpace succeeds, might a write to a connected persistence backend fail? (I can imagine Client A writing to an AtomSpace, but persistence failing and the connection to the OpenCog server terminating; then Client B reading an unpersisted value from the AtomSpace.) If so, this introduces a consistency issue. If the AtomSpace process crashes, or its underlying hardware fails, what happens? I assume there's no notion of a write-ahead log, but I may be wrong. Just trying to plan for eventualities.
-
Hi Alex,
I cannot give you a clear, easy answer, because the AtomSpace falls into neither the conventional relational DB model nor the key-value model (BASE). The following might illuminate the situation:
This seems to be about distributed datastores. The "raw" AtomSpace is not distributed, so this does not apply. However... there is an API called StorageNode. It allows any given AtomSpace to connect to ... storage! Currently, this is RocksDB, Postgres, flat files, or other remote AtomSpaces on the network. What is saved/restored, or exchanged on the network, is up to the user. The network storage node is just a peer-to-peer API: what Atoms the peers trade with one another is up to them. I don't know what the Jepsen test implies for this. If a remote peer crashes ... well, whatever Atoms you are holding are what they are. They're not damaged.

p.s. Creating new StorageNodes to other systems is really pretty easy. It'll take more than a day or two, but should take less than a week or two. So if you have some favorite system ... have at it! It's not hard.
This uses the StorageNode mechanism, so the only things saved there are what you explicitly write out. It's designed that way, because saving everything automatically, all the time, would be a huge performance bottleneck, esp. for rapidly changing data. (If you're clever you can write a thread to do this, if you feel you really need to. But if you really need to, you are probably mis-designing your system.) If you ...

For me, the AtomSpace never crashes: I run multi-month-long processing jobs. What I do get are thunderstorms that knock out power, and the solution for that is a UPS. (I mean, sure, you can write code that crashes while it has the AtomSpace open. I think it's unlikely to corrupt your data, but it's kind of on you if you are working regularly in that scenario. The AtomSpace was not designed for banking applications.)
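A toy sketch of the "only what you explicitly write out is saved" behaviour, in Python. This is not the AtomSpace or StorageNode API: `ToySpace`, `ToyStorage`, and `simulate_crash_and_reload` are hypothetical names, and atoms are modelled as plain strings purely for illustration.

```python
class ToyStorage:
    """Hypothetical stand-in for a storage backend (e.g. a file on disk)."""

    def __init__(self):
        self.saved = set()

    def store(self, atom):
        self.saved.add(atom)


class ToySpace:
    """Hypothetical in-RAM container; nothing is persisted automatically."""

    def __init__(self, storage):
        self.atoms = set()
        self.storage = storage

    def add(self, atom):
        # The atom lives in RAM only until store() is called for it.
        self.atoms.add(atom)

    def store(self, atom):
        # Explicit, user-initiated write to the backend.
        self.storage.store(atom)


def simulate_crash_and_reload(storage):
    """A crash discards RAM; only explicitly stored atoms come back."""
    fresh = ToySpace(storage)
    fresh.atoms = set(storage.saved)
    return fresh


storage = ToyStorage()
space = ToySpace(storage)
space.add('(Concept "a")')
space.add('(Concept "b")')
space.store('(Concept "a")')  # only "a" is written out

recovered = simulate_crash_and_reload(storage)
print(sorted(recovered.atoms))
```

In this model, Client B could still read `(Concept "b")` from the live space before the crash, even though it was never stored; that is the gap the original question asks about.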
-
I just converted this from a GitHub "issue" to a GitHub "discussion". I've never used discussions before. Hope this works.
-
Appreciate your detailed response, @linas! Sounds like it's a linearizable consistency model, with thread-safety on a per-atom basis (with the caveat of the possible case of the partially-written atom), but not across atoms. I'd imagine that if for whatever reason a BindLink fails mid-execution due to, e.g., a machine shutdown on AWS, the state of the AtomSpace will reflect whatever atoms have been written up to that point, but this may be inconsistent with respect to the overall system and will require manual repair.
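To make that reading concrete, here is a minimal Python sketch of per-atom locking without any cross-atom transaction. It only illustrates the interpretation above and is not AtomSpace code: `ToyAtom`, `rewrite_all`, and the `fail_after` parameter are hypothetical.

```python
import threading


class ToyAtom:
    """Hypothetical atom with its own mutex (per-atom thread-safety)."""

    def __init__(self, name):
        self.name = name
        self.value = 0
        self.lock = threading.Lock()

    def set_value(self, value):
        # Each individual write is protected by this atom's lock only.
        with self.lock:
            self.value = value


def rewrite_all(atoms, new_value, fail_after=None):
    """Update atoms one at a time; no transaction spans the whole loop."""
    for i, atom in enumerate(atoms):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated shutdown mid-rewrite")
        atom.set_value(new_value)


atoms = [ToyAtom(n) for n in ("a", "b", "c")]
try:
    rewrite_all(atoms, 1, fail_after=2)
except RuntimeError:
    pass  # nothing is rolled back

state = {a.name: a.value for a in atoms}
print(state)
```

Each atom is individually consistent, but the set as a whole is left half-updated, which is the "requires manual repair" situation described above.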