Skip to content

ci: merge staging to master #74

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Aug 22, 2022
Merged

ci: merge staging to master #74

merged 18 commits into from
Aug 22, 2022

Conversation

maxwell-aiden
Copy link
Member

@maxwell-aiden maxwell-aiden commented Aug 19, 2022

This is an automatic PR generated by the pipeline CI/CD. This will be automatically fast-forward merged if successful.

emmacasolin and others added 8 commits August 18, 2022 11:10
- `@matrixai/db` 5.0.2 supports `ToString` lock requests
This was needed for the very specific case where `_open` was used to create a file during an existing transaction.
fix: `tests/inodes/INodeManager.test.ts` change to using `await expect().rejects.toThrow`
This just adds some tests that hit the API with a bunch of concurrent operations with random order. This is to check for any db transaction errors.
Updating js-db to 5.x.x (integrating RocksDB and DB-native concurrency control)
@maxwell-aiden maxwell-aiden self-assigned this Aug 19, 2022
@maxwell-aiden
Copy link
Member Author

@ghost
Copy link

ghost commented Aug 19, 2022

👇 Click on the image for a new way to code review
  • Make big changes easier — review code in small groups of related files

  • Know where to start — see the whole change at a glance

  • Take a code tour — explore the change with an interactive tour

  • Make comments and review — all fully sync’ed with github

    Try it now!

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map Legend

@CMCDragonkai
Copy link
Member

Benchmarking and docs update.

@CMCDragonkai
Copy link
Member

There is mac and windows failures to be addressed here.

@maxwell-aiden
Copy link
Member Author

@maxwell-aiden
Copy link
Member Author

Need to remember to use `utils.pathJoin` for EFS file paths and not `path.join` or `pathNode.join`.
@maxwell-aiden
Copy link
Member Author

@CMCDragonkai
Copy link
Member

So a possible problem is that await efs.stop() is calling await db.stop(), this could attempt to be rolling back transactions in the transaction references.

Now why would a transaction still exist? Especially one that is actually already committed.

It could be due to the write stream/read stream closing the fd, which triggers a transactional GC operation. This closing operation however is an orphaned/dangling promise. It's not awaited for because the write/read streams are callback oriented but also written before alot of change to the EFS.

So the transaction is probably being committed, and in the process of being destroyed. But at the same time a db.stop() is called which is iterating over the transaction references to attempt to roll them back.

There are several possible solutions here. The main idea is that db.stop() is attempting to rollback dangling transactions, but it was assumed that any transactions that exist are neither rollbacked nor committed.

A quick solution might be remove the transaction from the transaction references as soon as this._committed = true; or this._rollbacked = true; is set. This can then remove it from the references immediately.

This sort of means it's possible that the DB lifecycle is stopped BEFORE the tranaction is fully committed or rollbacked. I'm not entirely sure if this is an error.

Another alternative is to change how the db.stop is working to wait on the transaction completion. That is it can attempt to rollback, but check before if it is already in a "pending" state, and then skip the transaction, or just wait for its destruction. How to wait for its destruction? We would need to expose an ability to wait for the transaction to be destroyed. This could be done by making commit and rollback blocking with respect to the life cycle functions with @ready, and then just relying on an idempotent destroy.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 19, 2022

Let's try the second solution, change

  @ready(new errors.ErrorDBTransactionDestroyed())
  public async rollback(e?: Error): Promise<void> {

To:

  @ready(new errors.ErrorDBTransactionDestroyed(), true)
  public async rollback(e?: Error): Promise<void> {

And it should work fine...

And also remove it from the transaction references after this._committed = true and this._rollbacked = true;.

@CMCDragonkai
Copy link
Member

Do the same for commit.

@tegefaulkes
Copy link
Contributor

I think there's a problem with the write and read stream implementation that is part if the root cause here. In our implementation of the _destroy method we don't call the callback once this._fs.close has finished . This is possibly leading to close being an orphaned promise that is creating a transaction, If the stars align we end up with the transaction committed error when the EFS and DB are stopped.

It's annoying to test if any solution works since this is a hisenbug.

@CMCDragonkai
Copy link
Member

Actually it's a bit more complicated than this.

It's possible to have a transaction that is neither committed nor rollbacked but only because the transaction hasn't finished yet. It is truly a pending transaction.

In this situation, we end up calling rollback() which then causes the transaction to be rollbacking. Later the transaction may attempt to call commit, but end up failing because it's already rollbacked.

Now the DB itself is actually already stopped at this point, so this is a legitimate error.

Furthermore I cannot use ready blocking on commit and rollback because of the fact that this results in a promise deadlock due to the resource release.

@CMCDragonkai
Copy link
Member

I think there's a problem with the write and read stream implementation that is part if the root cause here. In our implementation of the _destroy method we don't call the callback once this._fs.close has finished . This is possibly leading to close being an orphaned promise that is creating a transaction, If the stars align we end up with the transaction committed error when the EFS and DB are stopped.

It's annoying to test if any solution works since this is a hisenbug.

It's possible fixing this might fix #72.

This is a possible bug leading to issues with #72?
@maxwell-aiden
Copy link
Member Author

@tegefaulkes
Copy link
Contributor

Looking at the streams API. It's more proper to await the close event for the stream to be fully ended with with all resources closed. Reviewing it's usage in tests we are waiting for the finish event for write streams and end events for read streams. both of these events are emitted before streams resources are fully cleaned up.

I'm going to fix this in all of the tests.

@maxwell-aiden
Copy link
Member Author

@tegefaulkes
Copy link
Contributor

One test failed on the mac tests. Thankfully no failures was due to the already committed error so I think that's resolved now.

Pushing up a quick fix for the failing test.

@maxwell-aiden
Copy link
Member Author

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 22, 2022

The benchmarks here actually don't bench the EFS at all. So you should remove the benches/db_* benches, and also remove it from results. We can keep the crypto ones for now until we abstract our crypto.

New EFS benchmarks can be written in a new PR. Can do this later.

@tegefaulkes
Copy link
Contributor

Looks like we're still getting a transaction already committed error but this one is different,

EncryptedFS Concurrency › concurrent file writes › EncryptedFS.ftruncate and EncryptedFS.writeFile, EncryptedFS.write and EncryptedFS.createWriteStream
    Transaction is already committed
      588 |       return;
      589 |     }
    > 590 |     const gc = await tran.get<null>([
          |                           ^
      591 |       ...this.gcDbPath,
      592 |       inodesUtils.iNodeId(ino as INodeIndex),
      593 |     ]);
      at node_modules/@matrixai/db/src/utils.ts:307:9
      at Object.transactionGet (node_modules/@matrixai/db/src/utils.ts:292:12)
      at constructor_.get (node_modules/@matrixai/db/src/DBTransaction.ts:263:29)
      at constructor_.get (node_modules/@matrixai/async-init/src/CreateDestroy.ts:111:20)
      at constructor_.get (src/inodes/INodeManager.ts:590:27)
  ● EncryptedFS Concurrency › concurrent file writes › EncryptedFS.lseek, EncryptedFS.writeFile, EncryptedFS.writeFile with fd, EncryptedFS.write, EncryptedFS.readFile, EncryptedFS.read, and seeking position
    Transaction is already committed
      588 |       return;
      589 |     }
    > 590 |     const gc = await tran.get<null>([
          |                           ^
      591 |       ...this.gcDbPath,
      592 |       inodesUtils.iNodeId(ino as INodeIndex),
      593 |     ]);
      at node_modules/@matrixai/db/src/utils.ts:307:9
      at Object.transactionGet (node_modules/@matrixai/db/src/utils.ts:292:12)
      at constructor_.get (node_modules/@matrixai/db/src/DBTransaction.ts:263:29)
      at constructor_.get (node_modules/@matrixai/async-init/src/CreateDestroy.ts:111:20)
      at constructor_.get (src/inodes/INodeManager.ts:590:27)
  ● EncryptedFS Concurrency › concurrent symlinking › EncryptedFS.symlink, EncryptedFS.symlink and EncryptedFS.mknod
    Transaction is already committed
      588 |       return;
      589 |     }
    > 590 |     const gc = await tran.get<null>([
          |                           ^
      591 |       ...this.gcDbPath,
      592 |       inodesUtils.iNodeId(ino as INodeIndex),
      593 |     ]);
      at node_modules/@matrixai/db/src/utils.ts:307:9
      at Object.transactionGet (node_modules/@matrixai/db/src/utils.ts:292:12)
      at constructor_.get (node_modules/@matrixai/db/src/DBTransaction.ts:263:29)
      at constructor_.get (node_modules/@matrixai/async-init/src/CreateDestroy.ts:111:20)
      at constructor_.get (src/inodes/INodeManager.ts:590:27)

This one doesn't seem to be related to the streaming. Looks like the commonality here is file/inode creation? I'll need to do some digging.

@CMCDragonkai
Copy link
Member

That messages comes from the C++ code, not from the JS code. It appears it's not actually wrapped in an error. There's a code for this that I'm not catching in the js-db cause I didn't expect to see these errors.

The error code is TRANSACTION_COMMITTED or TRANSACTION_ROLLBACKED.

#define ASSERT_TRANSACTION_READY_CB(env, transaction, callback)              \
  if (transaction->isCommitting_ || transaction->hasCommitted_) {            \
    napi_value callback_error = CreateCodeError(                             \
        env, "TRANSACTION_COMMITTED", "Transaction is already committed");   \
    NAPI_STATUS_THROWS(CallFunction(env, callback, 1, &callback_error));     \
    NAPI_RETURN_UNDEFINED();                                                 \
  }                                                                          \
  if (transaction->isRollbacking_ || transaction->hasRollbacked_) {          \
    napi_value callback_error = CreateCodeError(                             \
        env, "TRANSACTION_ROLLBACKED", "Transaction is already rollbacked"); \
    NAPI_STATUS_THROWS(CallFunction(env, callback, 1, &callback_error));     \
    NAPI_RETURN_UNDEFINED();                                                 \
  }

#define ASSERT_TRANSACTION_READY(env, transaction)                  \
  if (transaction->isCommitting_ || transaction->hasCommitted_) {   \
    napi_throw_error(env, "TRANSACTION_COMMITTED",                  \
                     "Transaction is already committed");           \
    NAPI_RETURN_UNDEFINED();                                        \
  }                                                                 \
  if (transaction->isRollbacking_ || transaction->hasRollbacked_) { \
    napi_throw_error(env, "TRANSACTION_ROLLBACKED",                 \
                     "Transaction is already rollbacked");          \
    NAPI_RETURN_UNDEFINED();                                        \
  }

@CMCDragonkai
Copy link
Member

See that I'm not even catching that:

»» ~/Projects/js-db/src
 ♜ ag 'TRANSACTION_'                                            (staging) pts/3 15:11:48
DBTransaction.ts
484:          if (e.code === 'TRANSACTION_CONFLICT') {

native/napi/utils.h
33:#define NAPI_TRANSACTION_CONTEXT() \
74:#define ASSERT_TRANSACTION_READY_CB(env, transaction, callback)              \
77:        env, "TRANSACTION_COMMITTED", "Transaction is already committed");   \
83:        env, "TRANSACTION_ROLLBACKED", "Transaction is already rollbacked"); \
88:#define ASSERT_TRANSACTION_READY(env, transaction)                  \
90:    napi_throw_error(env, "TRANSACTION_COMMITTED",                  \
95:    napi_throw_error(env, "TRANSACTION_ROLLBACKED",                 \

native/napi/index.cpp
830:  NAPI_TRANSACTION_CONTEXT();
831:  ASSERT_TRANSACTION_READY(env, transaction);
844:  NAPI_TRANSACTION_CONTEXT();
888:  NAPI_TRANSACTION_CONTEXT();
907:  NAPI_TRANSACTION_CONTEXT();
915:  ASSERT_TRANSACTION_READY_CB(env, transaction, callback);
927:  NAPI_TRANSACTION_CONTEXT();
935:  ASSERT_TRANSACTION_READY_CB(env, transaction, callback);
947:  NAPI_TRANSACTION_CONTEXT();
966:  NAPI_TRANSACTION_CONTEXT();
986:  NAPI_TRANSACTION_CONTEXT();
990:  ASSERT_TRANSACTION_READY_CB(env, transaction, callback);
1002:  NAPI_TRANSACTION_CONTEXT();
1005:  ASSERT_TRANSACTION_READY_CB(env, transaction, callback);
1014:  NAPI_TRANSACTION_CONTEXT();
1015:  ASSERT_TRANSACTION_READY(env, transaction);
1027:  NAPI_TRANSACTION_CONTEXT();
1028:  ASSERT_TRANSACTION_READY(env, transaction);
1059:  NAPI_TRANSACTION_CONTEXT();
1060:  ASSERT_TRANSACTION_READY(env, transaction);
1078:  NAPI_TRANSACTION_CONTEXT();
1079:  ASSERT_TRANSACTION_READY(env, transaction);

native/napi/worker.cpp
106:    argv = CreateCodeError(env, "TRANSACTION_CONFLICT", errMsg_);

The only code I expect at 484 is TRANSACTION_CONFLICT.

How did you manage to try to call tran.get() after the transaction is already committed?

@CMCDragonkai
Copy link
Member

I think I see that you're getting during garbage collection. Do note that garbage collection transaction should be independent from any user created filesystem operation.

@CMCDragonkai
Copy link
Member

You might want to review the GC logic again, cause I wrote that back when the DB didn't have fully consistent reads and writes. Might have missed something when updating to the new DB.

@maxwell-aiden
Copy link
Member Author

This was odd since it should've been caught by linting.
@maxwell-aiden
Copy link
Member Author

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 22, 2022

Probably worthwhile to also catch those errors and rethrow as JS errors since it appears to be possible if an asynchronous function is called with a reference to the transaction and the transaction gets committed. This sort of happens due to destroy and commit and rollback being separate things. I might "roll" commit and rollback into destroy function with just a type, and this might make things more robust. So like destroy('commit') or destroy('rollback') with it defaulted on rollback. Or make commit() and rollback() just call destroy appropriately.

@maxwell-aiden
Copy link
Member Author

@CMCDragonkai
Copy link
Member

Ok we have it working now. Changing to the native promise as the species actually makes the coder simpler since I can rely on PromiseCancellable.from to wrap the native promise.

    PromiseCancellable.then
      ✓ p3 = p1.then(() => p2) - p2 rejection propgates to p3 (104 ms)
      ✓ p3 = p1.then(() => p2) - p3 cancellation results in early rejection of p3 and p2 (101 ms)
      ✓ p3 = p1.then(() => p2) - delayed p3 cancellation results in early rejection of p3 and late rejection of p2 (102 ms)
      ✓ p3 = p1.then(() => p2) - p3 cancellation chained to p1 cancellation propagates rejection (27 ms)

Also the usage of catch() resolves the unhandled promise rejection issue.

@CMCDragonkai
Copy link
Member

The docs for finally seems to imply that it can return a rejected promise or throw an exception. If this is true, we may need to apply the catch() trick to the finally method as well. Perhaps also the catch method.

@CMCDragonkai CMCDragonkai added r&d:polykey:core activity 2 Cross Platform Cryptography for JavaScript Platforms r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management and removed r&d:polykey:core activity 2 Cross Platform Cryptography for JavaScript Platforms labels Jul 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management
Development

Successfully merging this pull request may close these issues.

Streams don't seem to emit the close event
4 participants