[Nexthop] Simplify run_test.py arguments #898
Draft: anna-nexthop wants to merge 2 commits into `facebook:main` from `nexthop-ai:anna-nexthop.run_test-simplify` (base: `main`)
**Pre-submission checklist**
- [x] I've run the linters locally and fixed lint errors related to the
files I modified in this PR. You can install the linters by running `pip
install -r requirements-dev.txt && pre-commit install`
- [x] `pre-commit run`
Add support for running FBOSS benchmark test binaries as test suites
using the existing `run_test.py` script. Benchmark tests are performance
measurement binaries that measure metrics like throughput, latency, and
speed of various FBOSS operations.
Also includes documentation updates on how to run benchmarks via the
`run_test.py` script.
Users can now run benchmark suites instead of individual binaries, with
automated CSV output containing detailed metrics for analysis.
- Added `BenchmarkTestRunner` as a **standalone class** (does not extend
`TestRunner`) for straightforward execution
- Added "benchmark" subcommand to `run_test.py`
- All benchmark-related configuration constants are scoped to the class
(not global variables) for easier maintenance
- Created benchmark suite configuration files in
`fboss/oss/hw_benchmark_tests/`:
- `t1_benchmarks.conf` - T1 agent benchmark suite (9 total)
- `t2_benchmarks.conf` - T2 agent benchmark suite (13 total)
- `additional_benchmarks.conf` - Remaining benchmarks (15 total)
- Configuration files are packaged to `./share/hw_benchmark_tests/`
following the same pattern as other test configs
- Parses benchmark output to extract performance metrics:
- Benchmark test name
- Relative time per iteration
- Iterations per second
- CPU time (microseconds)
- Maximum RSS memory usage
- Generates timestamped CSV files with results for tracking and analysis
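The parsing and CSV generation described above could look like the following sketch. This is a hypothetical reconstruction, not the PR's actual `_parse_benchmark_output()`: the function names, the assumed folly-benchmark-style output line, and the omission of the CPU-time/RSS fields are all simplifications.

```python
import csv
import re
from datetime import datetime

# Assumed output format (hypothetical): a folly-benchmark-style line such as
#   runTxSlowPathBenchmark    52.12s    19.19m
# The real parser also extracts CPU time and max RSS, omitted here.
BENCHMARK_LINE = re.compile(
    r"^(?P<name>\w+)\s+(?P<time_per_iter>[\d.]+\w+)\s+(?P<iters_per_sec>[\d.]+\w+)$"
)


def parse_benchmark_output(output: str):
    """Return (test_name, relative_time_per_iter, iters_per_sec) or None."""
    for line in output.splitlines():
        m = BENCHMARK_LINE.match(line.strip())
        if m:
            return m.group("name"), m.group("time_per_iter"), m.group("iters_per_sec")
    return None


def write_results_csv(rows):
    """Write rows to a timestamped CSV mirroring the columns shown in this PR."""
    path = f"benchmark_results_{datetime.now():%Y%m%d_%H%M%S}.csv"
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow([
            "benchmark_binary_name", "benchmark_test_name", "test_status",
            "relative_time_per_iter", "iters_per_sec", "cpu_time_usec", "max_rss",
        ])
        w.writerows(rows)
    return path
```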
- Three status values:
- **OK**: Benchmark completed successfully with full metrics
- **FAILED**: Benchmark failed or produced incomplete output
- **TIMEOUT**: Benchmark exceeded timeout limit (default: 1200 seconds)
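The three-way status classification above can be sketched with `subprocess.run` and its timeout. This is a simplified, hypothetical version of `_run_benchmark_binary()` (the real runner also treats incomplete metric output as FAILED, which is not shown here):

```python
import subprocess

DEFAULT_TIMEOUT_SEC = 1200  # matches the default stated in this PR


def run_benchmark(cmd, timeout=DEFAULT_TIMEOUT_SEC):
    """Run one benchmark binary and map the outcome to OK/FAILED/TIMEOUT."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Benchmark exceeded the timeout limit.
        return "TIMEOUT", ""
    except OSError:
        # Binary missing or not executable counts as a failure.
        return "FAILED", ""
    if proc.returncode != 0:
        return "FAILED", proc.stdout
    return "OK", proc.stdout
```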
- Updated `package.py` to include benchmark binaries in
`agent-benchmarks` package
- Updated `package-fboss.py` to copy benchmark configuration files to
`share/hw_benchmark_tests/`
- Added **Option 1: Using the run_test.py Script (Recommended)** section
with examples for:
- Running all benchmarks
- Running T1 benchmark suite
- Running T2 benchmark suite
- Running additional benchmarks
- Kept existing manual execution as **Option 2: Running Individual
Binaries** with the complete list of benchmark binaries
- Added note about CSV output with timestamped filenames
- Updated **T1 Tests → Agent Benchmark Tests** section to use
`run_test.py` command with reference to `t1_benchmarks.conf`
- Updated **T2 Tests → Agent Benchmark Tests** section to use
`run_test.py` command with reference to `t2_benchmarks.conf`
- Follows the same pattern as other test types (SAI, QSFP, Link tests)
for consistency
- Used the `file=...` reference so the list of benchmarks is
maintained in one place for both documentation and execution
Verified documentation formatting is correct locally
<img width="777" height="586" alt="image"
src="https://github.com/user-attachments/assets/09e6230b-10db-4f80-9e2b-395886e906eb"
/>
<img width="1912" height="1270" alt="image"
src="https://github.com/user-attachments/assets/bac98ba6-8654-42e8-9585-71dc3e4f2633"
/>
The `BenchmarkTestRunner` is implemented as a simple standalone class:
- **Class-level constants**: `BENCHMARK_CONFIG_DIR`,
`T1_BENCHMARKS_CONF`, `T2_BENCHMARKS_CONF`, `ALL_BENCHMARKS_CONF`
- **Public methods**:
- `add_subcommand_arguments()` - Register command-line arguments
- `run_test(args)` - Main entry point for running benchmarks
- **Private methods**:
- `_parse_benchmark_output()` - Extract metrics from benchmark output
- `_run_benchmark_binary()` - Execute a single benchmark binary
Unlike other test runners that extend the abstract `TestRunner` base
class, `BenchmarkTestRunner` is a standalone class. This design choice
was made because:
- Benchmark tests are standalone binaries, not gtest-based tests
- They don't need warmboot/coldboot variants
- They don't use the standard test filtering mechanisms
- A simpler, more direct implementation is more maintainable
Since not all benchmarks are expected to run on a given device (e.g. DNX
vs. XGS Broadcom chips), the runner also filters out any binaries that
are not available. In addition, since a vendor must explicitly enable
building the benchmark binaries, the runner prints a helpful message if
no benchmark binaries are found.
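The availability filtering might be implemented along these lines. This is a hedged sketch: the function name `filter_available` and the `/opt/fboss/bin` search directory are assumptions, not the PR's actual code.

```python
import shutil


def filter_available(binaries, search_dir="/opt/fboss/bin"):
    """Keep only benchmark binaries that exist and are executable on this
    device (DNX vs. XGS builds ship different sets)."""
    available = [b for b in binaries if shutil.which(b, path=search_dir)]
    if not available:
        # Vendors must explicitly enable building benchmark binaries.
        print(
            "No benchmark binaries found. Benchmarks must be enabled "
            "explicitly at build time."
        )
    return available
```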
```bash
python3 -m py_compile fboss/oss/scripts/run_scripts/run_test.py
python3 -m py_compile fboss/oss/scripts/package.py
python3 -m py_compile fboss/oss/scripts/package-fboss.py
```
Added two unit tests using the pytest module to cover expected behavior
of the run_test script: comprehensive coverage of the benchmark
subcommand, and a check that there are no overlaps between the benchmark
suite lists. These tests are integrated into CMake and run with ctest.
Comprehensive coverage of `BenchmarkTestRunner` includes 25 test cases
covering:
- Loading from custom filter file
- Handling nonexistent files
- Handling empty files
- Loading from default T1, T2, and additional configs
- Handling missing default configs
- Handling all configs missing
- Successful parsing with all metrics
- Missing JSON metrics
- Missing benchmark line
- Empty output
- Different time units
- Successful execution
- Timeout handling
- Execution failure
- Exception handling
- Execution with config arguments
- Nonexistent filter file handling
- List tests mode
- Full execution with CSV writing
- No existing binaries
- Some missing binaries
- End-to-end workflow from default configs to list tests
Implementation of the unit tests includes these key features:
- All tests use mocking where appropriate to avoid side effects
- Tests verify both success and error cases
- Temporary files are properly cleaned up
- Follows pythonic code style with list comprehensions where appropriate
To run locally (in Docker environment):
```bash
cd fboss/oss/scripts/run_scripts
/usr/bin/python3 -m pytest test_run_test.py -v
```
Verified benchmark subcommand is available
```bash
./run_test.py benchmark --help
Setting fboss environment variables
usage: run_test.py benchmark [-h] [--filter_file FILTER_FILE] [--platform_mapping_override_path [PLATFORM_MAPPING_OVERRIDE_PATH]]
optional arguments:
-h, --help show this help message and exit
--filter_file FILTER_FILE
File containing list of benchmark binaries to run (one per line).
--platform_mapping_override_path [PLATFORM_MAPPING_OVERRIDE_PATH]
A file path to a platform mapping JSON file to be used.
```
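Registering a subcommand that produces help output like the above could be sketched with `argparse` as follows. This is a hypothetical reconstruction of what `add_subcommand_arguments()` might do; the helper name and parser wiring are assumptions, though the option names and help strings mirror the `--help` output shown in this PR.

```python
import argparse


def add_benchmark_subcommand(subparsers):
    # Register the 'benchmark' subcommand and its two options.
    p = subparsers.add_parser("benchmark", help="Run FBOSS benchmark test binaries")
    p.add_argument(
        "--filter_file",
        help="File containing list of benchmark binaries to run (one per line).",
    )
    p.add_argument(
        "--platform_mapping_override_path",
        nargs="?",
        help="A file path to a platform mapping JSON file to be used.",
    )
    return p


parser = argparse.ArgumentParser(prog="run_test.py")
subparsers = parser.add_subparsers(dest="command")
add_benchmark_subcommand(subparsers)

# Example: parse the suite-filtered invocation used later in this PR.
args = parser.parse_args(["benchmark", "--filter_file", "t1_benchmarks.conf"])
```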
Verified configuration files exist and are valid when loaded onto a device
```bash
ls -la fboss/oss/hw_benchmark_tests/*.conf
cat fboss/oss/hw_benchmark_tests/t1_benchmarks.conf
cat fboss/oss/hw_benchmark_tests/t2_benchmarks.conf
cat fboss/oss/hw_benchmark_tests/additional_benchmarks.conf
```
Run all benchmarks (default):
```bash
./bin/run_test.py benchmark
```
Run T1 benchmark suite:
```bash
./bin/run_test.py benchmark --filter_file ./share/hw_benchmark_tests/t1_benchmarks.conf
```
Run T2 benchmark suite:
```bash
./bin/run_test.py benchmark --filter_file ./share/hw_benchmark_tests/t2_benchmarks.conf
```
Run remaining additional benchmarks:
```bash
./bin/run_test.py benchmark --filter_file ./share/hw_benchmark_tests/additional_benchmarks.conf
```
- Configuration files will be packaged to `./share/hw_benchmark_tests/`
when `package-fboss.py` runs
- Benchmark binaries will be packaged to `bin/` directory
Ran the t1 suite via run_test with a benchmark binary that's expected to
hang on xgs chips:
```bash
[root@fboss]# ./bin/run_test.py benchmark --filter_file ./share/hw_benchmark_tests/t1_benchmarks.conf
Setting fboss environment variables
Running benchmark tests...
Running benchmarks from ./share/hw_benchmark_tests/t1_benchmarks.conf
Total benchmarks to run: 9
Running command: sai_tx_slow_path_rate-sai_impl --fruid_filepath=/var/facebook/fboss/fruid.json --enable_sai_log WARN --logging DBG4
...
================================================================================
BENCHMARK RESULTS SUMMARY
================================================================================
sai_tx_slow_path_rate-sai_impl: OK
sai_rx_slow_path_rate-sai_impl: OK
sai_ecmp_shrink_speed-sai_impl: OK
sai_rib_resolution_speed-sai_impl: OK
sai_ecmp_shrink_with_competing_route_updates_speed-sai_impl: OK
sai_fsw_scale_route_add_speed-sai_impl: OK
sai_stats_collection_speed-sai_impl: FAILED
sai_init_and_exit_100Gx100G-sai_impl: OK
sai_switch_reachability_change_speed-sai_impl: TIMEOUT
================================================================================
Total: 9 benchmarks
OK: 7
Failed: 1
Timed Out: 1
```
Verified the CSV file has sane information:
```bash
[root@fboss]# cat benchmark_results_20260115_191249.csv
benchmark_binary_name,benchmark_test_name,test_status,relative_time_per_iter,iters_per_sec,cpu_time_usec,max_rss
sai_tx_slow_path_rate-sai_impl,runTxSlowPathBenchmark,OK,52.12s,19.19m,126299302,1620652
sai_rx_slow_path_rate-sai_impl,RxSlowPathBenchmark,OK,32.53s,30.74m,42003121,1550232
sai_ecmp_shrink_speed-sai_impl,HwEcmpGroupShrink,OK,7.08s,141.30m,22483771,1607068
sai_rib_resolution_speed-sai_impl,RibResolutionBenchmark,OK,2.11s,474.92m,22031184,1826796
sai_ecmp_shrink_with_competing_route_updates_speed-sai_impl,HwEcmpGroupShrinkWithCompetingRouteUpdates,OK,7.23s,138.26m,23599807,1753464
sai_fsw_scale_route_add_speed-sai_impl,HwFswScaleRouteAddBenchmark,OK,1.35s,743.22m,24730127,1937892
sai_stats_collection_speed-sai_impl,,FAILED,,,,
sai_init_and_exit_100Gx100G-sai_impl,HwInitAndExit100Gx100GBenchmark,OK,16.07s,62.22m,31929920,2162992
sai_switch_reachability_change_speed-sai_impl,,TIMEOUT,,,,
```
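Since the timestamped CSV is meant for tracking and analysis, a downstream consumer could tally statuses with a few lines of stdlib Python. This is an illustrative sketch, not part of the PR; only the column names come from the CSV shown above.

```python
import csv
from collections import Counter


def summarize(csv_path):
    """Tally test_status values from a results CSV like the one above."""
    with open(csv_path, newline="") as f:
        return Counter(row["test_status"] for row in csv.DictReader(f))
```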
## Summary

Simplify the command-line interface for run_test.py by removing redundant options and consolidating related functionality.

## Changes

### Removed Options

- Remove `--sai-bin` and `--hw-agent-bin-path` options (install WIP binaries in the absolute path instead)
- Remove `--asic` option (consolidated into `--enable-production-features`)

### Updated Options

- Consolidate `--enable-production-features` to take ASIC as an argument instead of being a boolean flag
- Update all test binaries to use absolute paths (`/opt/fboss/bin/*`)

### Bug Fixes

- Fix typo: `producition_features` -> `production_features`

### Documentation

- Update documentation with simplified command examples
- Add test configuration variables documentation
- Simplify T0/T1/T2 test examples in documentation

## Testing

### Integration Test Results

**Device:** fboss101
**Build ID:** 23679
**Test Suite:** 11 targeted integration tests
**Results:** ✅ **11/11 PASSED (100%)**

<details>
<summary>Click to expand full test results</summary>

```
=========================================
Starting Integration Tests for PR facebook#408
DUT: fboss101
Results Directory: /tmp/pr408_integration_results_20260203_155518
=========================================

========================================
Test 1: Verify --enable-production-features takes ASIC argument
Command: ssh fboss101 -c 'cd /opt/fboss && ./bin/run_test.py sai_agent --help 2>&1 | grep -A3 "enable-production-features"'
========================================
✓ PASSED
Output:
  [--enable-production-features ASIC]
  [--platform_mapping_override_path [PLATFORM_MAPPING_OVERRIDE_PATH]]
  [--agent-run-mode [{mono,multi_switch}]] [--num-npus {1,2}] --
  --enable-production-features ASIC
      Enable filtering by production features for the specified ASIC
  --platform_mapping_override_path [PLATFORM_MAPPING_OVERRIDE_PATH]

========================================
Test 2: Test --enable-production-features with valid ASIC (tomahawk5)
Command: ssh fboss101 -c 'cd /opt/fboss && timeout 30 ./bin/run_test.py sai_agent --list_tests --config ./share/hw_test_configs/meru400biu.agent.materialized_JSON --enable-production-features tomahawk5 2>&1 | tail -10'
========================================
✓ PASSED
Output:
  VerifyPortRateTraffic/1  # GetParam() = 100000
  VerifyPortRateTraffic/2  # GetParam() = 200000
  VerifyPortRateTraffic/3  # GetParam() = 400000
  test/AgentHwPtpTcProvisionTests.
  VerifyPtpTcDelayRequest/TWENTYFIVEG  # GetParam() = 25000
  VerifyPtpTcDelayRequest/FIFTYG  # GetParam() = 50000
  VerifyPtpTcDelayRequest/HUNDREDG  # GetParam() = 100000
  VerifyPtpTcDelayRequest/TWOHUNDREDG  # GetParam() = 200000
  VerifyPtpTcDelayRequest/FOURHUNDREDG  # GetParam() = 400000

========================================
Test 3: Test --enable-production-features with invalid ASIC shows error
Command: ssh fboss101 -c 'cd /opt/fboss && timeout 10 ./bin/run_test.py sai_agent --list_tests --config ./share/hw_test_configs/meru400biu.agent.materialized_JSON --enable-production-features invalid_asic_xyz 2>&1 || true' | grep "Error: ASIC"
========================================
✓ PASSED
Output: Error: ASIC 'invalid_asic_xyz' not found in production features file.

========================================
Test 4: Verify --asic option is removed
Command: ssh fboss101 -c 'cd /opt/fboss && ./bin/run_test.py sai_agent --help 2>&1' | grep -c "\-\-asic"
========================================
✓ PASSED
Output: 0

========================================
Test 5: Verify --sai-bin option is removed from global options
Command: ssh fboss101 -c 'cd /opt/fboss && ./bin/run_test.py --help 2>&1' | grep -c "\-\-sai-bin"
========================================
✓ PASSED
Output: 0

========================================
Test 6: Verify --hw-agent-bin-path option is removed
Command: ssh fboss101 -c 'cd /opt/fboss && ./bin/run_test.py sai_agent --help 2>&1' | grep -c "\-\-hw-agent-bin-path"
========================================
✓ PASSED
Output: 0

========================================
Test 7: Verify run_test.py references absolute paths for test binaries
Command: ssh fboss101 -c 'sudo cat /opt/fboss/bin/run_test.py | grep -c "/opt/fboss/bin/sai_test-sai_impl"'
========================================
✓ PASSED
Output: 1

========================================
Test 8: Verify typo fix: production_features (not producition_features)
Command: ssh fboss101 -c 'sudo cat /opt/fboss/bin/run_test.py' | grep -c "producition_features"
========================================
✓ PASSED
Output: 0

========================================
Test 9: Verify run_test.py has valid Python syntax
Command: ssh fboss101 -c 'sudo cp /opt/fboss/bin/run_test.py /tmp/run_test_syntax_check.py && python3 -m py_compile /tmp/run_test_syntax_check.py && echo "Syntax check passed" && sudo rm -f /tmp/run_test_syntax_check.py*'
========================================
✓ PASSED
Output: Syntax check passed

========================================
Test 10: Verify run_test.py can be imported as module
Command: ssh fboss101 -c 'cd /opt/fboss/bin && python3 -c "import sys; sys.path.insert(0, \".\"); import run_test" && echo "Import successful"'
========================================
✓ PASSED
Output: Import successful

========================================
Test 11: Verify --production-features option removed from qsfp subcommand
Command: ssh fboss101 -c 'cd /opt/fboss && ./bin/run_test.py qsfp --help 2>&1' | grep -c "\-\-production-features"
========================================
✓ PASSED
Output: 0

=========================================
Test Summary
=========================================
Total Tests: 11
Passed: 11
Failed: 0
=========================================
Results saved to: /tmp/pr408_integration_results_20260203_155518
All tests passed!
```

</details>

Commit metadata (from the commit template):

- Date: Tue Feb 3 12:16:27 2026 -0800
- On branch anna-nexthop.run_test-simplify
- Changes to be committed:
  - modified: docs/docs/testing/test_categories.md
  - modified: fboss/oss/scripts/run_scripts/run_test.py
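The option consolidation in this PR's summary (a boolean `--enable-production-features` flag plus a separate `--asic` option becoming one option that takes the ASIC name) could be sketched with `argparse` as below. This is a hypothetical reconstruction, not the actual `run_test.py` code; only the option name, metavar, and help text come from the `--help` output in the test log.

```python
import argparse

parser = argparse.ArgumentParser(prog="run_test.py sai_agent")

# Before this PR (hypothetical reconstruction): a boolean flag plus a
# separate --asic option.
#   parser.add_argument("--enable-production-features", action="store_true")
#   parser.add_argument("--asic")

# After: one option that takes the ASIC name directly.
parser.add_argument(
    "--enable-production-features",
    metavar="ASIC",
    dest="production_features_asic",
    help="Enable filtering by production features for the specified ASIC",
)

args = parser.parse_args(["--enable-production-features", "tomahawk5"])
```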
This PR depends on and is based off of https://github.com/facebook/fboss/pull/895/changes, since it references the benchmark subcommand in this change. The commit to review is the latest, at 450c827.