Skip to content
This repository was archived by the owner on Aug 2, 2022. It is now read-only.

Conversation

@linhuang-blockone
Copy link
Contributor

@linhuang-blockone linhuang-blockone commented May 15, 2020

Change Description

This PR resolves https://blockone.atlassian.net/projects/EPE/issues/EPE-43.

Background

It was reported from a customer that when file system which data/blocks belongs to is
running out of space, the producer continued to produce blocks and update state but
blocks.log was corrupted in a sense that it no longer contained all the irreversible blocks.
It was also observed that when file system which data/state belongs is running out of space, nodeos will crash with SIGBUS as the state file is unable to acquire new pages.

Design

The solution is to have a dedicated plugin to monitor resource usages (file system space for now,
CPU, memory, and networking bandwidth in the future). The plugin uses a thread to periodically
check space usage of file systems of directories being monitored. If space used
is over a predefined threshold, a graceful shutdown is initiated.

The directories to be monitored are configured. They can be any paths for of blocks, state, trace, state history, and others.

Alternative Solution

Alternative solution to blocks directory space exhaustion was considered: starting from
block_log::append, make sure all exceptions was handled properly. But it was deemed to be
too tedious and error prone to review all possible exceptions, and might likely break consensus. The plugin approach is more generic, monitoring any file systems and without modifying
existing code.

Additional Bug Fixed

This PR also fixes a bug in log_irreversible of libraries/chain/controller.cpp.
db.commit was done before blog.append. This could be a problem if blog.append threw.
Now db.commit is done after blog.append. Even if blog.append throws, DB can be rolled back.

Consensus Changes

  • Consensus Changes

API Changes

  • API Changes

Documentation Additions

  • Documentation Additions

resource_monitor_plugin is a new plugin. It is not must have but recommended to be used.

Configurable Parameters

resource-monitor-interval-seconds
Time in seconds between two consecutive checks of resource usage. It should be between 1 and 300 (5 minutes). This is optional; the default value is 2 seconds.

resource-monitor-space-threshold
Threshold in terms of percentage of used space vs total space. If used space is above (threshold - 5%), a warning is generated. If used space is above the threshold and resource-monitor-not-shutdown-on- threshold-exceeded is enabled, a graceful shutdown is initiated. The value should be between 6 and 99

resource-monitor-not-shutdown-on-threshold-exceeded
Used to indicate nodeos will not shutdown when threshold is exceeded.

@matthewdarwin
Copy link

I'm not sure if 95% is a good default value for file system space. By default some file systems reserve 5% for superuser. So if nodeos is running as a non-privileged user, then writes will fail before the new check will fail. (eg https://ma.ttias.be/change-reserved-blocks-ext3-ext4-filesystem-linux/)

Of course the check is customizable, but maybe more conservative would be better as a default?

@spoonincode
Copy link
Contributor

I'm a bit concerned this isn't easy enough for users to make use of: it requires them to manually configure the directories to monitor.

IMO we should explore having other plugins register their directories. For example, maybe trace plugin does something like

if(auto resmon_plugin = app().find_plugin<resource_monitor_pligin>())
  resmon_plugin->monitor_directory(trace_dir);

This way the directories that are used are automatically monitored (we could maybe even make resource_monitor_plugin default enabled).

Be aware in the above example we would need to be mindful about plugin initialization and startup order -- trace_api_plugin configures trace_dir during its initialize() but that may be before resource_monitor_plugin's initialize().

@linhuang-blockone
Copy link
Contributor Author

I was thinking of always monitoring anything under "data" directory but then thought the user
might have better idea about file system space configurations. If we choose auto registration by other plugins, is the order of plugins initialization the same as they are specified on the command line? Or can we make the initialization of resource monitor the last so it can get "data_dir", "trace_dir", and any other directories we are interested in? This way we don't need to modify other plugin's code.

Copy link
Contributor

@b1bart b1bart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, this system is not that complex however, the description mentions a path for future additions. As this grows it will need a strategy for testability.

In other parts of the code we have tried to abstract dependencies (in this case boost::filesystem calls) from the code so that unit tests can create very simple "mocks" for nominal and edge cases. Particularly as we start to measure other system stats we may have a very difficult time triggering edge cases safely.

As a model, you can look at the trace-api plugin and the template-based "provider pattern" or pull from other sources of patterns. The primary point is to allow for controlled repeatable testing of the logic that will ultimately product a production environment. How this is accomplished is a secondary or tertiary concern.

@spoonincode
Copy link
Contributor

If we choose auto registration by other plugins, is the order of plugins initialization the same as they are specified on the command line? Or can we make the initialization of resource monitor the last so it can get "data_dir", "trace_dir", and any other directories we are interested in?

The specifics will have to be investigated and studied. I believe the plugin object gets instantiated as a global constructor so all plugins -- even plugins not enabled -- will be available via get_plugin<>() as soon as main() starts. That is a little bizarre and maybe not what appbase originally intended. Regardless, it means something like

if(auto resmon_plugin = app().find_plugin<resource_monitor_pligin>())
  resmon_plugin->monitor_directory(trace_dir);

may call in to resource_monitor_plugin before resource_monitor_plugin::initialize(). Or it might call in to it after resource_monitor_plugin::initialize(). resource_monitor_plugin would have to be prepared to handle both cases

@heifner
Copy link
Contributor

heifner commented May 20, 2020

What you do know is at time of plugin_startup, all plugin_initialize have been called.

@linhuang-blockone
Copy link
Contributor Author

I think I can add a function to be called by resource monitor's plugin_startup.
The function will read other's plugin's configured directories we want to monitor plus
app().data_dir(). This way we don't need to change other plugin's code.

… and integration test cases, terminate thread right after plugin shutdown, and make apps register monitored directories
EOS_ASSERT(threshold > 0 && threshold < 100, chain::plugin_config_exception,
"\"resource-monitor-space-threshold\" must be greater than 0 and less than 100");

space_handler.set_threshold(threshold);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving the assert into the set_threshold method so you don't have to guard against overflow in your calculation.

Copy link
Contributor Author

@linhuang-blockone linhuang-blockone Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to capture invalid input as early as possible.

@spoonincode
Copy link
Contributor

Should the plugin be default enabled? Seems like now that it is monitoring all the directories automatically that would be desirable?

…n test from Bash to Python next), make shutdown on exceeding threshold as default, increase interval upper bound to 5 minutes
…e_monitor_plugin_impl, changed integration test script from Bash to Python
…and relax timing requirements for test cases
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants