Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support MAINTENANCE state of storage nodes #1798

Merged

Conversation

cthulhu-rider
Copy link
Contributor

@cthulhu-rider cthulhu-rider commented Sep 19, 2022

I'm NOT going to cover Replication chapter of #1680 since it's better to be done in a separate PR.

I also decided to postpone the final solution of blocking local Object service in maintenance to #1834.

@cthulhu-rider cthulhu-rider self-assigned this Sep 19, 2022
@codecov
Copy link

codecov bot commented Sep 19, 2022

Codecov Report

Merging #1798 (d9261bb) into master (4eb0ed1) will decrease coverage by 0.04%.
The diff coverage is 22.15%.

@@            Coverage Diff             @@
##           master    #1798      +/-   ##
==========================================
- Coverage   31.27%   31.23%   -0.05%     
==========================================
  Files         374      375       +1     
  Lines       26411    26438      +27     
==========================================
- Hits         8261     8257       -4     
- Misses      17458    17494      +36     
+ Partials      692      687       -5     
Impacted Files Coverage Δ
...md/neofs-adm/internal/modules/morph/remove_node.go 0.00% <0.00%> (ø)
cmd/neofs-node/config.go 0.00% <0.00%> (ø)
cmd/neofs-node/netmap.go 0.00% <0.00%> (ø)
cmd/neofs-node/object.go 0.00% <0.00%> (ø)
pkg/innerring/processors/netmap/process_epoch.go 0.00% <0.00%> (ø)
pkg/innerring/processors/netmap/process_peers.go 0.00% <0.00%> (ø)
pkg/innerring/processors/netmap/processor.go 0.00% <0.00%> (ø)
pkg/morph/client/netmap/config.go 0.00% <0.00%> (ø)
pkg/morph/client/netmap/snapshot.go 0.00% <0.00%> (ø)
pkg/morph/client/netmap/update_state.go 0.00% <0.00%> (ø)
... and 10 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@cthulhu-rider cthulhu-rider force-pushed the feature/1681-ir-netmap-control-states branch from db61157 to 7936515 Compare September 19, 2022 12:42
@cthulhu-rider cthulhu-rider changed the title Change verification of node states on Inner Ring side Support MAINTENANCE state of storage nodes Sep 19, 2022
@cthulhu-rider cthulhu-rider force-pushed the feature/1681-ir-netmap-control-states branch 4 times, most recently from 9909bdb to f900342 Compare September 30, 2022 13:43
@cthulhu-rider cthulhu-rider force-pushed the feature/1681-ir-netmap-control-states branch from f900342 to 2d54506 Compare September 30, 2022 13:45
@cthulhu-rider cthulhu-rider marked this pull request as ready for review September 30, 2022 15:08
pkg/morph/client/netmap/update_state.go Outdated Show resolved Hide resolved
pkg/morph/client/netmap/update_state.go Outdated Show resolved Hide resolved
Copy link
Member

@carpawell carpawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest renaming all the new (in that PR) NetMap to Netmap

//
// See also netmap.NodeInfo.IsOnline/SetOnline.
func (x *NetMapCandidateValidator) VerifyAndUpdate(node *netmap.NodeInfo) error {
if node.IsOnline() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a little off top: why don't we have state getters in SDK? it would allow logging the exact bootstrap state and make debugging easier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For debugging we can provide func (NodeInfo) StringifyState() string method.

pkg/morph/event/netmap/update_peer.go Outdated Show resolved Hide resolved
pkg/innerring/netmap.go Outdated Show resolved Hide resolved
*/

// wraps Netmap contract's client and provides state.NetworkSettings.
type networkSettings netmapclient.Client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be the client? how about a cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be the client?

What do you mean?

how about a cache?

#1835

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

current implementation does not allow caching => dont get why we have a separate type if we dont use it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because netmapclient.Client does not provide state.NetworkSettings by design.

pkg/morph/client/netmap/config.go Show resolved Hide resolved
pkg/morph/event/netmap/update_peer.go Outdated Show resolved Hide resolved
pkg/morph/event/netmap/update_peer_notary.go Outdated Show resolved Hide resolved
fyrchik
fyrchik previously approved these changes Oct 4, 2022
pkg/services/object/get/service.go Outdated Show resolved Hide resolved
@cthulhu-rider cthulhu-rider force-pushed the feature/1681-ir-netmap-control-states branch 2 times, most recently from ce88647 to a686619 Compare October 4, 2022 13:53
@cthulhu-rider
Copy link
Contributor Author

Suggest renaming all the new (in that PR) NetMap to Netmap

I absolutely don't like introducing such changes partially. Lets postpone #1836.

Leonard Lyubich added 9 commits October 4, 2022 18:01
… only

In previous implementation Inner Ring allowed storage nodes with any
state to register in the network. According to the current design, only
nodes with ONLINE state are allowed to enter the network map.

Create new `state` sub-package of `nodevalidation` package of Inner Ring
application. Define `state.NetMapCandidateValidator` type and provide
`NodeValidator` interface required by the Inner Ring's processor of
`Netmap` contract's notification events. Embed new validator into the
one used by the Inner Ring application.

From now all `AddPeer` notifications with node state other than `ONLINE`
will be denied.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
… server

In previous implementation storage node interpreted all status values
sent in `SetNetmapStatus` RPC as `OFFLINE` except `ONLINE` and
`MAINTENANCE`. This could lead to incorrect processing of new values,
and also didn't allow detection of problems with sending garbage values.

Make implementation of `NodeState` interface used by Control API server
to deny requests with statuses other than protocol-declared enum.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
New network status of storage nodes is going to be introduced. To
simplify the addition, it would be useful to prepare the code for this.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
… network

There is a need to prevent limitless abuse of MAINTENANCE status of the
storage nodes. To do this, configuration of the NeoFS network is going
to be extended with the flag which allows the state. Until this is done,
it makes sense to prepare a site for this in the code.

Define `state.NetworkSettings` interface as an abstraction of global
network configuration within the `state` package. Make
`NetMapCandidateValidator` to depend on `NetworkSettings` and provide
corresponding field setter. Change `VerifyAndUpdate` method's behavior
to return an error for candidates with MAINTENANCE state if this state
is disallowed by the network configuration. Provide `NetworkSettings`
from the wrapper over Netmap contract's client on Inner Ring application
side. The provider is implemented to statically disallow MAINTENANCE
mode in order to save previous behavior.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
…tions

`readBoolConfig` method is going to be reused for reading other
configuration values. All boolean settings are `false` by default, so it
makes sense to return default value on missing key directly from
`readBoolConfig`.

Handle `ErrConfigNotFound` case in `readBoolConfig` method. Change
`HomomorphicHashDisabled` method to call `readBoolConfig` directly.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
…onfig

`NetworkConfiguration` represents NeoFS network configuration stored in
the Sidechain. In previous implementation the configuration missed flag
of disabled homomorphic hashing.

Add `NetworkConfiguration.HomomorphicHashingDisabled` boolean field.
Decode the field in `Client.ReadNetworkConfiguration` method. Print this
value in `netmap netinfo` command of NeoFS CLI.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
After recent changes in the NeoFS API protocol network configuration
contains `MaintenanceModeAllowed` boolean flag. There is a need to
support the config value in all NeoFS applications.

Provide `Client.MaintenanceModeAllowed` method which read the config
from the Sidechain. Extend `NetworkConfiguration` structure with
`MaintenanceModeAllowed` field.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
… in RPC

After recent changes network configuration stored in the Netmap contract
of the NeoFS Sidechain contains `MaintenanceModeAllowed` flag. There is
a need to support this value in Storage node application.

Make `NetmapService.NetworkInfo` RPC server of the storage node to set
`MaintenanceModeAllowed` flag according to corresponding value in the
Sidechain.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
After recent changes network configuration provided by NeoFS storage
nodes contains `MaintenanceModeAllowed` flag. There is
a need to support this value in NeoFS CLI application.

Print `MaintenanceModeAllowed` flag in `netmap netinfo` command.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Leonard Lyubich and others added 10 commits October 4, 2022 18:01
Inner Ring should allow registering of storage nodes with `MAINTENANCE`
state in the NeoFS network only if its configuration allows this status.

Make `networkSettings.MaintenanceModeAllowed` to call
`MaintenanceModeAllowed` method of underlying Netmap contract's client
in order to assert state allowance.

From now nodes will be accepted to the network with `MAINTENANCE` state
only with the appropriate network configuration.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
After recent changes in NeoFS protocol storage nodes can be in
`MAINTENANCE` state. There is a need to support this state in
`UpdateState` operation.

Add `UpdatePeerPrm.SetMaintenance` method which makes node to be
switched into `MAINTENANCE` mode after the `UpdatePeerState` operation.

New functionality is going to be used in Storage node application for
Control API serving.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
After recent changes Netmap contract can send `UpdateState` notification
event with `MAINTENANCE` node's state. There is a need to provide
functionality to work with the status.

Provide `UpdatePeer.Maintenance` method. Support new state in
`ParseUpdatePeer` and `ParseUpdatePeerNotary` functions.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Storage node can be requested to be switched into `MAINTENANCE` state.
Inner Ring should accept such requests only if network configuration
allows it.

Make `Processor` of Netmap contract's notifications to depend on
`state.NetworkSettings`. Make `Processor.processUpdatePeer` to call
`MaintenanceModeAllowed` if notification event relates to `MAINTENANCE`
mode`. Share singe `state.NetworkSettings` provider in Inner Ring
application.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
…` state

After recent changes `MAINTENANCE` state is reflected in the Sidechain.
Storage node should switch its state to "maintenance" during serving the
`ControlService.SetNetmapStatus` RPC with correspoding status in the
request.

Call `UpdatePeerState` operation of Netmap contract's client in
`control.NodeState` provider on Storage node app side. The op is
executed if `BlockExecution` on local object storage is succeeded.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Make storage node to return `NODE_UNDER_MAINTENANCE` status
error on each local object operation if the node is in `MAINTENANCE`
mode.

Pass `apistatus.NodeUnderMaintenance` to `StorageEngine.BlockExecution`
during `ControlService.SetNetmapStatus` RPC processing.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
After recent Netmap contract changes all read methods which return
network map (either candidates or snapshots) encode node descriptors
into same structure.

Decode `netmap.Node` contract-side structure from the call results.
Replace node state with the value from the `netmap.Node.State` field.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
Make `netmap snapshot` command to print `MAINTENANCE` state of the nodes
with `IsMaintenance()` flag set.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
In previous implementation node blocked any operation of local object
storage in maintenance mode. There is a need to perform some storage
operations like data evacuation or restoration.

Do not call block storage engine in maintenance mode. Make all Object
service operations to return `apistatus.NodeUnderMaintenance` error from
each local op.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
@carpawell
Copy link
Member

I absolutely don't like introducing such changes partially

Me neither. But it is the second PR that is doing that. All the previous structs/pkgs/comments consider "netmap" as a single word.

Lets postpone #1836.

Ok. Did not know we have an issue for that.

@fyrchik fyrchik merged commit 713aea0 into nspcc-dev:master Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants