Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[faucet] Improving observability and throughput #7091

Merged
merged 2 commits into from
Jan 3, 2023

Conversation

amnn
Copy link
Member

@amnn amnn commented Jan 3, 2023

A number of changes to improve the quality of metrics we receive from the faucet, and also reduce bottlenecking at the core of the service, to achieve a higher transaction throughput:

Fixes to SimpleFaucet::prepare_gas_coin

This function had a confusing control-flow that hid a few bugs:

  • The consumer lock was held for its entire body, when it was only needed at the beginning (to recieve a Coin ID). Fixed by factoring out the part of the function that needs to hold that consumer lock into pop_gas_coin.
  • If the consumer lock timed out, the function produced an error that was never used. Fixed by deleting this unused error message.
  • The rest of the body is contained within an unconditional loop which always runs exactly once. Fixed by removing the loop and simplifying the control-flow.
  • The control-flow path that discards a coin because it has insufficient balance did not update the total_discarded_coins metric. Fixed by moving the logic for updating this metric to transfer_gases which is responsible for handling the return from prepare_gas_coin -- this is also the function that is responsible for recycling coins that should be kept.

Standardise format for Coin IDs in log messages

  • Output them as a coin_id label that can be parsed and queried against, rather than inline in the message.

Improvements to RequestMetrics middleware

  • Detect and count disconnects, printing their elapsed time in the logs.
  • Treat non-successful (2xx) HTTP Responses as failed requests

Test Plan:

Unit tests:

sui$ cargo nextest run

Test the various new features on a local instance of the faucet, by:

 # Starting validators
sui$ cargo build
$ rm -rf ~/.sui                        \
  && ~/sui/target/debug/sui genesis -f \
  && ~/sui/target/debug/sui start

 # Starting faucet (new terminal session)
$ ~/sui/target/debug/sui-faucet \
  --request-buffer-size 2       \
  --max-request-per-second 1

 # Sending a faucet request (excuse fish shell syntax)
$ set -l ADDRESS (yq -r .active_address ~/.sui/sui_config/client.yaml | sed 's/0x//' | string upper)
  curl --location --request POST 'http://127.0.0.1:5003/gas' \
      --header 'Content-Type: application/json'              \
      --data-raw "{ \"FixedAmountRequest\": { \"recipient\": \"$ADDRESS\" } }"
  • Test detecting request failure by replacing the body of SimpleFaucet::pop_gas_coin with None
  • Test load shedding by running multiple faucet requests in while loops (three is sufficient)
  • Test disconnection metrics by adding a sleep command within SimpleFaucet::prepare_gas_coin and interrupting the curl command before it finishes.

A number of changes to improve the quality of metrics we receive from
the faucet, and also reduce bottlenecking at the core of the service,
to achieve a higher transaction throughput:

Fixes to `SimpleFaucet::prepare_gas_coin`:

This function had a confusing control-flow that hid a few bugs:

- The consumer lock was held for its entire body, when it was only
  needed at the beginning (to recieve a Coin ID).  Fixed by factoring
  out the part of the function that needs to hold that consumer lock.
- If the consumer lock timed out, the function produced an error that
  was never used.  Fixed by deleting this unused error message.
- The rest of the body is contained within an unconditional loop which
  always runs exactly once.  Fixed by removing the loop and
  simplifying the control-flow.
- The control-flow path that discards a coin because it has
  insufficient balance did not update the `total_discarded_coins`
  metric.  Fixed by moving the logic for updating this metric to
  `transfer_gases` which is responsible for handling the return from
  `prepare_gas_coin` -- this is also the function that is responsible
  for recycling coins that should be kept.

Standardise format for Coin IDs in log messages:

- Output them as a `coin_id` label that can be parsed and queried
  against, rather than inline in the message.

Improvements to RequestMetrics middleware:

- Detect and count disconnects, printing their elapsed time in the logs.
- Treat non-successful (2xx) HTTP Responses as failed requests

Test Plan:

Unit tests:
```
sui$ cargo nextest run
```

Test the various new features on a local instance of the faucet, by:

```
 # Starting validators
sui$ cargo build
$ rm -rf ~/.sui                        \
  && ~/sui/target/debug/sui genesis -f \
  && ~/sui/target/debug/sui start

 # Starting faucet (new terminal session)
$ ~/sui/target/debug/sui-faucet \
  --request-buffer-size 2       \
  --max-request-per-second 1

 # Sending a faucet request (excuse fish shell syntax)
$ set -l ADDRESS (yq -r .active_address ~/.sui/sui_config/client.yaml | sed 's/0x//' | string upper)
  curl --location --request POST 'http://127.0.0.1:5003/gas' \
      --header 'Content-Type: application/json'              \
      --data-raw "{ \"FixedAmountRequest\": { \"recipient\": \"$ADDRESS\" } }"
```

- Test detecting request failure by replacing the body of
  `SimpleFaucet::pop_gas_coin` with `None`
- Test load shedding by running multiple faucet requests in `while`
  loops (three is sufficient)
- Test disconnection metrics by adding a sleep command within
  `SimpleFaucet::prepare_gas_coin` and interrupting the `curl` command
  before it finishes.
@amnn amnn added observability Observability for Sui, Narwhal, etc. Logging/metrics/alerting/tracing/events faucet labels Jan 3, 2023
@amnn amnn self-assigned this Jan 3, 2023
@vercel
Copy link

vercel bot commented Jan 3, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

2 Ignored Deployments
Name Status Preview Comments Updated
explorer-storybook ⬜️ Ignored (Inspect) Jan 3, 2023 at 6:14PM (UTC)
wallet-adapter ⬜️ Ignored (Inspect) Jan 3, 2023 at 6:14PM (UTC)

Copy link
Contributor

@tnowacki tnowacki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Faucet code looks good to me. Though I'm not very familiar with it. And I'm even less familiar with the metrics layer stuff.

Approving as it looks like a good improvement on the faucet side, and mostly harmless on the metrics side, though you might want another set of eyes

@amnn
Copy link
Member Author

amnn commented Jan 3, 2023

Thanks for the review @tnowacki! I wrote the metrics layer stuff previously, so it should be okay (at least in the sense that the change isn't entirely without context).

@amnn amnn merged commit e0a5e0b into MystenLabs:main Jan 3, 2023
@amnn amnn deleted the faucet-detect-error branch January 3, 2023 19:11
ebmifa pushed a commit that referenced this pull request Jan 3, 2023
A number of changes to improve the quality of metrics we receive from
the faucet, and also reduce bottlenecking at the core of the service, to
achieve a higher transaction throughput:

### Fixes to `SimpleFaucet::prepare_gas_coin`

This function had a confusing control-flow that hid a few bugs:

- The consumer lock was held for its entire body, when it was only
needed at the beginning (to recieve a Coin ID). Fixed by factoring out
the part of the function that needs to hold that consumer lock into
`pop_gas_coin`.
- If the consumer lock timed out, the function produced an error that
was never used. Fixed by deleting this unused error message.
- The rest of the body is contained within an unconditional loop which
always runs exactly once. Fixed by removing the loop and simplifying the
control-flow.
- The control-flow path that discards a coin because it has insufficient
balance did not update the `total_discarded_coins` metric. Fixed by
moving the logic for updating this metric to `transfer_gases` which is
responsible for handling the return from `prepare_gas_coin` -- this is
also the function that is responsible for recycling coins that should be
kept.

### Standardise format for Coin IDs in log messages

- Output them as a `coin_id` label that can be parsed and queried
against, rather than inline in the message.

### Improvements to RequestMetrics middleware

- Detect and count disconnects, printing their elapsed time in the logs.
- Treat non-successful (2xx) HTTP Responses as failed requests

## Test Plan:

Unit tests:
```
sui$ cargo nextest run
```

Test the various new features on a local instance of the faucet, by:

```
 # Starting validators
sui$ cargo build
$ rm -rf ~/.sui                        \
  && ~/sui/target/debug/sui genesis -f \
  && ~/sui/target/debug/sui start

 # Starting faucet (new terminal session)
$ ~/sui/target/debug/sui-faucet \
  --request-buffer-size 2       \
  --max-request-per-second 1

 # Sending a faucet request (excuse fish shell syntax)
$ set -l ADDRESS (yq -r .active_address ~/.sui/sui_config/client.yaml | sed 's/0x//' | string upper)
  curl --location --request POST 'http://127.0.0.1:5003/gas' \
      --header 'Content-Type: application/json'              \
      --data-raw "{ \"FixedAmountRequest\": { \"recipient\": \"$ADDRESS\" } }"
```

- Test detecting request failure by replacing the body of
`SimpleFaucet::pop_gas_coin` with `None`
- Test load shedding by running multiple faucet requests in `while`
loops (three is sufficient)
- Test disconnection metrics by adding a sleep command within
`SimpleFaucet::prepare_gas_coin` and interrupting the `curl` command
before it finishes.
amnn added a commit to amnn/sui that referenced this pull request Jan 5, 2023
Lost during the refactor in MystenLabs#7091, which ironically fixed another
accounting bug in the faucet.

Test Plan:

Updated the unit tests to catch both this bug and the previous bug
that was fixed:

```
sui/crates/sui-faucet$ cargo nextest run
```
amnn added a commit that referenced this pull request Jan 5, 2023
Lost during the refactor in #7091, which ironically fixed another
accounting bug in the faucet.

## Test Plan

Updated the unit tests to catch both this bug and the previous bug that
was fixed:

```
sui/crates/sui-faucet$ cargo nextest run
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
faucet observability Observability for Sui, Narwhal, etc. Logging/metrics/alerting/tracing/events
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants