13 changes: 13 additions & 0 deletions tests/__init__.py
@@ -0,0 +1,13 @@
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
MMS Test Suites
"""
223 changes: 223 additions & 0 deletions tests/performance/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,223 @@
# Performance Regression Suite

This test suite runs load tests against MMS and monitors both process- and system-wide metrics, and it lets you specify pass/fail criteria for those metrics in each test case.
We use Taurus with JMeter as the test automation framework to run the test cases and monitor the metrics.

## How to run the test suite
To run the test suite, execute [run_perfomance_suite.py](run_perfomance_suite.py). You must provide the artifacts-dir path where the test case results will be stored.
You can select which test cases run by providing 'test-dir' (default: '$MMS_HOME/tests/performance/tests') and 'pattern' (default: '*.yaml'). For other options, use '--help'.
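For example, a typical invocation looks like the following (the artifacts path is illustrative; check '--help' for the exact spelling of the test-dir option):

```bash
python run_perfomance_suite.py \
    --artifacts-dir=/tmp/mms-perf-artifacts \
    --pattern='*.yaml'
```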

The script does the following:
1. Starts the metrics monitoring server (on by default, but optional)
2. Collects all the test yamls from test-dir that match the pattern
3. Executes the test yamls
4. Generates a JUnit XML and an HTML report in artifacts-dir

### A. Installation Prerequisites
1. Install Taurus. Taurus requires Python 3, but since your tests and the MMS instance can run in different virtual environments or on different machines,
you can configure the system so that the tests run on Python 3 while the MMS instance runs on Python 2 or 3.
Refer to the [installation docs](https://gettaurus.org/docs/Installation/) for more details.
```bash
pip install bzt # Needs python3.6+
```
2. Install other dependencies.
```bash
export MMS_HOME=<MMS_HOME_PATH>
pip install -r $MMS_HOME/tests/performance/requirements.txt
```
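To verify that Taurus installed correctly, you can invoke the bzt CLI (this simply prints usage information):

```bash
bzt --help
```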

### B. Running the test suite
1. Start the MMS server.
2. Make sure the parameters set in [global_config.yaml](tests/common/global_config.yaml) are correct.
3. Run the test suite runner script.
4. Check the console logs, the $artifacts-dir$/junit.html report, and the other artifacts.

**The steps are shown below:**
```bash
export MMS_HOME=<MMS_HOME_PATH>
cd $MMS_HOME/tests/performance

# Run the command below in a different terminal to start MMS
# multi-model-server --start

# check the variables
# vi tests/common/global_config.yaml
# jpeg download command for quick reference; set input_filepath in global_config.yaml
# curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg

python run_perfomance_suite.py --artifacts-dir='<path>' --pattern='*criteria*.yaml'
```

### C. Understanding the test suite artifacts and reports
1. $artifacts-dir$/junit.html contains the summary report of the test run. Note that each test yaml is treated as a
test suite, and each criterion in the yaml is treated as a test case. If no criteria are specified in the yaml, the test suite is marked as skipped with 0 test cases.
2. For each test yaml, a sub-directory is created containing its artifacts.
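The artifacts layout looks roughly like this (the file names beyond junit.html are illustrative):

```
<artifacts-dir>/
├── junit.html         # summary report for the whole run
├── junit.xml          # JUnit XML, consumable by CI tools
└── <test_yaml_name>/  # one sub-directory per test yaml, holding its run artifacts
```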



## How to add a test case to the test suite

To add a test case, follow the steps below:
1. Add a scenario
2. Add metrics to monitor
3. Add pass/fail criteria


#### 1. Add a scenario
You specify test scenarios in the scenarios section of the yaml.
To get started quickly, we have provided a sample JMeter script [here](tests/register_and_inference.jmx) and a sample Taurus yaml file [here](tests/call_jmx.yaml).

Here is how the sample call_jmx.yaml looks. Note that the variables used by the jmx script come from the [global_config.yaml](tests/common/global_config.yaml) file; a sketch of its layout follows the yaml below.

```yaml
execution:
- concurrency: 1
  ramp-up: 1s
  hold-for: 40s
  scenario: Inference

scenarios:
  Inference:
    script: register_and_inference.jmx
```
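As a rough illustration, global_config.yaml holds the shared values that the jmx scripts consume. The key names below are hypothetical placeholders; check the actual file for the real ones:

```yaml
# hypothetical sketch of tests/common/global_config.yaml -- the key names are assumptions
settings:
  env:
    hostname: 127.0.0.1        # host where MMS listens
    port: 8080                 # MMS inference port
    input_filepath: kitten.jpg # input image for inference requests (see section B)
```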

To run this individual test using Taurus (bzt), run the commands below:

```bash
export MMS_HOME=<MMS_HOME_PATH>
cd $MMS_HOME/tests/performance
bzt tests/call_jmx.yaml tests/common/global_config.yaml
```

**Note**:
Taurus supports different executors, such as JMeter, so you can use test scripts written for those frameworks as-is.
Details about executor types are provided [here](https://gettaurus.org/docs/ExecutionSettings/).
Details about how to run an existing JMeter script are provided [here](https://gettaurus.org/docs/JMeter/).


#### 2. Add metrics to monitor
You specify the metrics to monitor in the services/monitoring section of the yaml.
Metrics can be monitored in two ways:
1. Standalone monitoring server

If your MMS server is hosted on a different machine, use this method. Before running the test case,
start the [metrics_monitoring_server.py](metrics_monitoring_server.py) script; it communicates with the Taurus test client over sockets.
Specify the address and port (default: 9009) of the monitoring script in the test case yaml.

**Note**: For the available metrics, check AVAILABLE_METRICS in [metrics/__init__.py](metrics/__init__.py).
**Note**: When using the test suite runner script, there is no need to start the monitoring server manually; by default the script starts it during setup and stops it during teardown.

To start the monitoring server manually, run the commands below:
```bash
export MMS_HOME=<MMS_HOME_PATH>
pip install -r $MMS_HOME/tests/performance/requirements.txt
python $MMS_HOME/tests/performance/metrics_monitoring_server.py --start
```

A sample yaml with the monitoring section config is shown below; the complete yaml can be found [here](tests/inference_server_monitoring.yaml):

```yaml
services:
- module: monitoring
  server-agent:
  - address: localhost:9009 # metric monitoring service address
    label: mms-inference-server # if a label is specified, it is used in reports instead of ip:port
    interval: 1s # polling interval
    logging: True # these logs are saved to "SAlogs_192.168.0.1_9009.csv" in the artifacts dir
    metrics: # metrics must be supported by the monitoring service
    - sum_cpu_percent # CPU percent used by all the MMS server processes and workers
    - sum_memory_percent
    - sum_num_handles
    - server_workers # number of MMS workers
```

Use the Taurus command below to run the test yaml and observe the Metrics widget in the CLI live report:

```bash
export MMS_HOME=<MMS_HOME_PATH>
cd $MMS_HOME/tests/performance
bzt tests/inference_server_monitoring.yaml tests/common/global_config.yaml
```


2. Taurus local monitoring plugin

If your test client runs on the MMS server itself, you may want to use this method.
We have provided a custom Taurus plugin, [metrics_monitoring_taurus.py](metrics_monitoring_taurus.py).

**Note**: For the list of supported/available metrics, check [here](metrics_monitoring_taurus.py).
**Note**: When using the test suite runner script, there is no need to update PYTHONPATH manually; the script updates it.

Use the commands below to update PYTHONPATH so that Taurus picks up the plugin:

```bash
export MMS_HOME=<MMS_HOME_PATH>
export PYTHONPATH=$MMS_HOME/tests/performance:$PYTHONPATH
```

The relevant test yaml sections are shown below; the complete yaml can be found [here](tests/inference_taurus_local_monitoring.yaml):

```yaml
modules:
  server_local_monitoring:
    # metrics_monitoring_taurus and its dependencies should be on the python path
    class: metrics_monitoring_taurus.Monitor # monitoring class

services:
- module: server_local_monitoring # should be added in the modules section
  ServerLocalClient: # keyword from metrics_monitoring_taurus.Monitor
  - interval: 1s
    metrics:
    - cpu
    - disk-space
    - mem
    - sum_memory_percent
```

Use the Taurus command below to run the test yaml and observe the Metrics widget in the CLI live report:

```bash
export MMS_HOME=<MMS_HOME_PATH>
cd $MMS_HOME/tests/performance
bzt tests/inference_taurus_local_monitoring.yaml tests/common/global_config.yaml
```

#### 3. Add pass/fail criteria
You can specify pass/fail criteria for the test cases.
Read more about them [here](https://gettaurus.org/docs/PassFail/).

The relevant test yaml section:
```yaml
reporting:
- module: passfail
  criteria:
  - class: bzt.modules.monitoring.MonitoringCriteria
    subject: mms-inference-server/sum_num_handles
    condition: '>'
    threshold: 180
    timeframe: 1s
    fail: true
    stop: true
```

This example fails the test (and stops the run) if the total number of open handles held by the MMS processes exceeds 180 over a 1s timeframe.

The test yamls can be found [here](tests/inference_server_monitoring_criteria.yaml) and [here](tests/inference_taurus_local_monitoring_criteria.yaml).
Use the commands below to run the test cases:

```bash
export MMS_HOME=<MMS_HOME_PATH>
cd $MMS_HOME/tests/performance
bzt tests/inference_server_monitoring_criteria.yaml tests/common/global_config.yaml
bzt tests/inference_taurus_local_monitoring_criteria.yaml tests/common/global_config.yaml
```


## Work in Progress
1. Add more metrics, for both CPU and GPU, and add documentation around them.
2. Add hooks for registering custom metrics; add a metrics registry.
3. Better reporting and artifact management.
4. Enhance the framework with better abstractions that hide Taurus and the other scripts.
5. Automatic threshold calculation and environment profiles.
6. Comparison between runs and environments.
15 changes: 15 additions & 0 deletions tests/performance/__init__.py
@@ -0,0 +1,15 @@
#!/usr/bin/env python3

# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
Performance Metrics Monitoring suite
"""
6 changes: 6 additions & 0 deletions tests/performance/config.ini
@@ -0,0 +1,6 @@
[server]
pid_file = model_server.pid

[monitoring]
HOST =
PORT = 9009
26 changes: 26 additions & 0 deletions tests/performance/configuration.py
@@ -0,0 +1,26 @@
#!/usr/bin/env python3

# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
Read configuration file
"""
import configparser

config = configparser.ConfigParser()
config.read('./config.ini')


def get(section, key, default=''):
    """Return the value for key in section, or default when it is missing."""
    try:
        return config[section][key]
    except KeyError:
        return default
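A quick usage sketch, reading the monitoring port defined in config.ini above (the import path assumes the code runs from tests/performance):

```python
import configuration

# falls back to the supplied default when the section or key is absent
port = int(configuration.get('monitoring', 'PORT', '9009'))
pid_file = configuration.get('server', 'pid_file', 'model_server.pid')
```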
Binary file added tests/performance/kitten.jpg
55 changes: 55 additions & 0 deletions tests/performance/metrics/__init__.py
@@ -0,0 +1,55 @@
#!/usr/bin/env python3

# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
Custom metrics
"""
import psutil  # assumption: the process handles passed in are psutil.Process objects

AVAILABLE_METRICS = ["sum_cpu_percent",
                     "sum_memory_percent",
                     "sum_num_handles",
                     "server_workers"]


def get_metrics(server_process, child_processes):
    """Get server-process-specific metrics, summed over the server and its workers."""

    # TODO - make this modular, maybe a separate function for each metric
    # TODO - allow users to add new metrics easily
    # TODO - make sure the available metrics list is maintained

    sum_cpu_percent = 0
    sum_memory_percent = 0
    sum_num_handles = 0
    server_workers = 0
    metrics = {}
    for process in [server_process] + child_processes:
        try:
            process.cpu_percent()  # warm-up call: psutil's first CPU reading is not meaningful
        except psutil.Error:
            pass  # the process may have exited or be inaccessible; skip it
        else:
            cpu_percent = process.cpu_percent()
            memory_percent = process.memory_percent()
            sum_cpu_percent += cpu_percent
            sum_memory_percent += memory_percent
            sum_num_handles += process.num_fds()
            server_workers += 1

    metrics["sum_cpu_percent"] = sum_cpu_percent
    metrics["sum_memory_percent"] = sum_memory_percent
    metrics["sum_num_handles"] = sum_num_handles
    metrics["server_workers"] = server_workers

    return metrics
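A minimal usage sketch, assuming the handles are psutil.Process objects and that the server pid is read from the pid_file configured in config.ini:

```python
import psutil

from metrics import get_metrics  # assumes tests/performance is on PYTHONPATH

with open('model_server.pid') as f:  # pid_file from config.ini
    server = psutil.Process(int(f.read().strip()))

print(get_metrics(server, server.children(recursive=True)))
```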