Skip to content

Conversation

@kim
Copy link
Contributor

@kim kim commented Sep 13, 2023

Description of Changes

Adds a new system table st_module which stores the hash of the stdb module currently associated with the database. The hash is updated transactionally when the database is initialized or updated (that is, the table does not have any rows until the init procedure ran successfully).

st_module also associates an "epoch" number with the row, which is used to store a fencing token. As sketched out in the changes to standalone, this allows to use an external (distributed) locking service to safely order lifecycle operations on the database. We can thus guarantee at-most-once semantics for init and update reducers.

API and ABI

  • This is a breaking change to the module ABI
  • This is a breaking change to the module API
  • This is a breaking change to the ClientAPI
  • This is a breaking change to the SDK API
  • This is a breaking change to the database internals

Because a new system table is introduced, compatibility with existing databases cannot be maintained. The patch is against cloud-next, which contains breaking changes anyways.

@kim kim changed the base branch from master to kim/cloud-next September 13, 2023 10:37
@mamcx
Copy link
Contributor

mamcx commented Sep 15, 2023

I recommend you wait for my PR that changes and simplifies the use of system tables.

@kim
Copy link
Contributor Author

kim commented Sep 16, 2023

I’d love to, but this one is also fairly critical for cloud to work correctly. If you haven’t considered already, I would also greatly appreciate if new system tables could be introduced after the db was bootstrapped (I.e. a gap in the reserved sequence range).

@kim kim changed the title [WIP] Store the current module hash in a system table Store the current module hash in a system table Sep 18, 2023
@kim kim marked this pull request as ready for review September 18, 2023 15:05
@kim kim changed the base branch from kim/cloud-next to master September 20, 2023 05:41
kim added 9 commits September 20, 2023 08:29
Fill the user-retrievable database log with more info about what is
going on while a database is being initialized or updated.
Add a u64 field `epoch` to the st_module table, used to store a fencing
token.
Need another line of logs, as there is now more output.
Make getting and setting the program hash trait methods. Also widen the
epoch / fencing token to fit stdb sequences.
///
/// A programmable datastore is one which has a program of some kind associated
/// with it.
pub trait Programmable: TxDatastore {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming?

kim added 3 commits September 20, 2023 10:08
This allows to actually use relational db methods taking those types as
arguments outside the core crate.

Also make `next_sequence` and `create_sequence` not take `&mut self`
unnecessarily.
Comment on lines +522 to +528
/// Update the [`Hash`] of the program currently associated with the
/// datastore.
///
/// The operation runs within the transactional context `tx`. The fencing
/// token `fence` must be verified to be greater than in any previous
/// invocations of this method.
fn set_program_hash(&self, tx: &mut Self::MutTxId, fence: Self::FencingToken, hash: Hash) -> Result<()>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cause the program itself to be updated, or only the stored hash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the hash — but inside the transaction which updates the module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good point, actually -- the datastore doesn't store its own module. That does make sense in a distributed setting, but is another piece to worry about for keeping a replica consistent. Food for thought...

Comment on lines +557 to +560
// Release tx due to locking semantics and acquire a control db
// lock instead.
stdb.commit_tx(tx)?;
let lock = self.lock_database_instance_for_update(instance.id)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a race condition...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but we can’t thread through the tx. If a later thread is faster acquiring the lock, this update will not have any effect — that is, we guarantee at most once, not exactly once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. Not ideal, but probably fine.

@kim kim merged commit 0005096 into master Sep 28, 2023
@kim kim deleted the kim/db-init branch September 28, 2023 07:46
@kim kim mentioned this pull request Sep 29, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants