-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lock resync #608
Lock resync #608
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have other cases from #550 covered?
""" | ||
Lock objects should fill metabase on resync_metabase | ||
""" | ||
restart_storage_nodes(self.cluster.storage_nodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not enough for the nspcc-dev/neofs-node#1502 case, metabase must be resynchronized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not enough for the nspcc-dev/neofs-node#1502 case, metabase must be resynchronized.
@roman-khimov @carpawell can you please recommend what is the way to make the metabase resynchronized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A configuration change is needed before the node restart, resync_metabase
should be set to true
, see https://github.com/nspcc-dev/neofs-node/blob/master/docs/storage-node-configuration.md#shard-subsection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly recommend dropping metabase too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which means physically deleting metabase files (take path from the config). But it should be done on stopped node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roman-khimov @carpawell If I physically delete the metabase files, should I set resync_metabase to true?
Or should I change the configuration file in the dev-env https://github.com/nspcc-dev/neofs-dev-env/blob/master/services/storage/cfg/config.yml#L39 ?
Why is resync_metabase false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I physically delete the metabase files, should I set resync_metabase to true?
Yes.
Why is resync_metabase false?
It's never true
in production, this doesn't make much sense, it's only needed when there are some problems with it (storage failure or meta-incompatible node upgrade). So dev-env should have it set to false
as usual, it's only some specific test that wants to do something. BTW, maybe it'd be easier for you to use environment variable to override this setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After deleting the metabase files from all node object deletion fails with an error:
gRPC dial: context deadline exceeded
@roman-khimov @carpawell please tell me, is this expected behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a too-wide error, usually, it means that a node is not ready to handle connections (not started successfully?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
No, the other tests will be added in the next PR. |
132d7fa
to
50a4c29
Compare
5c18773
to
a7e75dd
Compare
Needs to be rebased |
d2fbb00
to
1bc64a5
Compare
Fixed |
|
||
modified_lines = [] | ||
|
||
for line in lines: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, since that is a tricky hack, isn't appending the file with the required envs an easier approach? or renaming the original file, reading it, creating a new file (and appending the needed lines to it), and then renaming it back?
but that is just a reading simplification. mb you think differently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, since that is a tricky hack, isn't appending the file with the required envs an easier approach? or renaming the original file, reading it, creating a new file (and appending the needed lines to it), and then renaming it back?
but that is just a reading simplification. mb you think differently
Yeah, I've been thinking about it.
However, it turns out that I have to copy the docker-compose.yml file from dev-env and change the path to the new file with new environment variables.
It seems to me that this will complicate the code too much and make it incomprehensible.
|
||
@allure.title("Locked object should be protected from deletion after the storage nodes are restarted") | ||
@pytest.mark.parametrize( | ||
# Only complex objects are required for this test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
Is it important to check the simple object here as well? If you think it's important, I'll add simple_object_size too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logic behind the complex objects are complicated enought to treat them an absolutelly different way of object storage, so i would expect testing both always. i have fixed both the bugs that relate the simple objects only and the bugs that affects chains of the objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logic behind the complex objects are complicated enought to treat them an absolutelly different way of object storage, so i would expect testing both always. i have fixed both the bugs that relate the simple objects only and the bugs that affects chains of the objects
Fixed
1bed9c2
to
19caf62
Compare
16f0083
to
0792593
Compare
This commit rolls back e1aed14 It was a mistake to make that change. Now it is not necessary, nspcc-dev/neofs-testcases#608 allows you to do without a separate flag. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
This commit rolls back 39f2bcf It was a mistake to make that change. Now it is not necessary, nspcc-dev/neofs-testcases#608 allows you to do without a separate flag. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
This commit rolls back 39f2bcf It was a mistake to make that change. Now it is not necessary, nspcc-dev/neofs-testcases#608 allows to do without a separate flag. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
This commit rolls back e1aed14 It was a mistake to make that change. Now it is not necessary, nspcc-dev/neofs-testcases#608 allows to do without a separate flag. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
Colleagues, thank you very much for noticing and participating in the discussion. In order for the tests to work correctly, we need to merge these 2 PRs first: @carpawell apologies to you - we made changes to the .env files before your review, if we had waited we wouldn't have had to revert those changes now. |
0792593
to
c18302b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But OK otherwise. @carpawell, @evgeniiz321?
Add DOCKER_COMPOSE_ENV_FILE, DOCKER_COMPOSE_CONFIG_FILE and METABASE_RESYNC_TIMEOUT Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
Added a functions that stop and restarts the specified storage nodes. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
Added a function to delete metadata from host for specified node. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
Added a function to enable metabase resync on start. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
The test_the_object_lock_should_be_kept_after_metabase_deletion test added. This test verifies the issue nspcc-dev/neofs-node#1502 Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
The test_enable_resync_metabase_delete_metadata and the test_delete_metadata tests added. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
c18302b
to
d1a7097
Compare
This commit rolls back e1aed14 It was a mistake to make that change. Now it is not necessary, nspcc-dev/neofs-testcases#608 allows to do without a separate flag. Signed-off-by: Oleg Kulachenko <oleg@nspcc.ru>
No description provided.