Skip to content

fix(cli): validate required args before running --diagnostic#741

Open
oboehmer wants to merge 16 commits intomainfrom
fix/664-zip-diagnostic
Open

fix(cli): validate required args before running --diagnostic#741
oboehmer wants to merge 16 commits intomainfrom
fix/664-zip-diagnostic

Conversation

@oboehmer
Copy link
Copy Markdown
Collaborator

@oboehmer oboehmer commented Apr 4, 2026

Description

This PR improves --diagnostic behavior and the bundled diagnostic collection script to make diagnostics safer, more accurate, and easier to consume.

CLI --diagnostic behavior (Fixes #740)

  • Move --diagnostic handling out of the Typer callback and into nac_test/cli/main.py so required args are validated by Typer/Click before diagnostics run.
  • Introduce run_diagnostic(...) (annotated NoReturn) that runs the diagnostic script and exits with the script’s return code.
  • Reject Windows early for --diagnostic.

Diagnostic script improvements (Fixes #664 + follow-ups)

  • Fail fast with a clear message when zip command is missing (observed on WSL2 environment)
  • Fix wrapped command exit code reporting (avoid incorrect PIPESTATUS usage).
  • Create zip archives with relative paths (avoid embedding full /var/... prefixes in the archive contents).
  • Use pyats version check instead of pyats.__version__ for reliable version capture.
  • Expect the wrapped nac-test command as a single quoted argument ($1) and enforce the contract.
  • Update Robot Framework artifact collection to use $OUTPUT_DIR/robot_results, and collect robot outputs via a single find pass.
  • Clarify log messaging around where nac-test output is captured.

Tests

  • Update tests/unit/cli/test_diagnostic.py to match the new run_diagnostic entrypoint.
  • Tighten _reconstruct_command coverage with strict shlex.join/shlex.split round-trip assertions.
  • Add fixtures for a canonical argv including required options and a variant containing spaces.
  • Silence a mypy false-positive around NoReturn in pytest.raises.

Closes

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature
  • Breaking change
  • Refactoring / Technical debt
  • Documentation update
  • Chore
  • Other (please describe):

Test Framework Affected

  • PyATS
  • Robot Framework
  • Both
  • N/A (not test-framework specific)

NaC Architecture Affected

  • N/A (architecture-agnostic)

Platform Tested

  • macOS (local dev)
  • Linux (distro/version tested: )

Key Changes

  • Validate required CLI args before running diagnostics (--diagnostic no longer bypasses missing options).
  • Ensure diagnostic summary reports the wrapped command’s real exit code.
  • Improve diagnostic archive portability (relative zip paths) and robustness (zip preflight, stricter command arg handling, PyATS version capture).
  • Adjust artifact collection paths for Robot Framework.

Testing Done

  • Unit tests added/updated
  • Integration tests performed
  • Manual testing performed:
    • PyATS tests executed successfully
    • Robot Framework tests executed successfully
    • HTML reports generated correctly
  • All existing tests pass (pytest / pre-commit run -a)

Test Commands Used

python -m pytest -q
pre-commit run -a

Additional Notes

  • Diagnostic command is passed to the script as a single shell-safe string (from shlex.join), and the script enforces that it receives exactly one command argument.
  • Robot outputs are now expected under $OUTPUT_DIR/robot_results.

oboehmer added 10 commits April 3, 2026 22:15
- Add early `zip` preflight in diagnostic collection script (with install hints)
- Hard-error on Windows when `--diagnostic` is used (exit code 2)
- Clarify `--diagnostic` help text as Linux/macOS-only
- Add unit test to ensure Windows gating avoids running subprocess
Handle --diagnostic inside main() after Typer/Click parsing so missing required
options (-d/-t) fail normally. Run the diagnostic script via run_diagnostic()
and exit with its return code; hard-error on Windows.
Add targeted `type: ignore[unreachable]` on assertions after calling
run_diagnostic (annotated NoReturn) inside pytest.raises blocks, so the
pre-commit mypy hook passes.
Introduce base_argv/base_argv_with_spaces fixtures and tighten reconstruct
tests to assert shlex round-trips exactly. Reuse base_argv in TestRunDiagnostic
to avoid ad-hoc argv construction.
Create the diagnostic zip from the parent directory so the archive contains
only the diagnostic folder (no full /var/... path prefixes).
Require the nac-test command to be passed as a single quoted argument ($1)
and fail fast otherwise. Also clarify the log message that nac-test output is
not shown on screen and is written to 100_nac_test_execution.txt.
Compute Robot Framework output file list once from $OUTPUT_DIR/robot_results
and reuse it for both logging and copying, avoiding duplicate filesystem scans.
@oboehmer oboehmer marked this pull request as ready for review April 6, 2026 14:20
@oboehmer oboehmer requested a review from aitestino April 6, 2026 14:21
@oboehmer oboehmer self-assigned this Apr 8, 2026
@oboehmer oboehmer added the bug Something isn't working label Apr 8, 2026
Copy link
Copy Markdown
Collaborator

@aitestino aitestino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @oboehmer, thank you for the PR -- this is a well-motivated fix that addresses a real usability gap. Moving --diagnostic out of the eager Typer callback and into main() so required args get validated first is exactly the right architectural call, and it eliminates ~35 lines of hand-rolled argument parsing (_extract_output_dir) that was duplicating what Typer already does. The $1 enforcement in the shell script, the zip preflight check, the relative archive paths, and the PIPESTATUS[0] bug fix are all genuine improvements. Clean PR overall.

I did a couple of reviews on this, and I think we need some small changes before merge. The core design is sound -- these are all polish items.

Things that need adjustment:

  1. ctx.command_path vs sys.argv[0] in the argv reconstruction -- run_diagnostic(output, argv=[ctx.command_path, *sys.argv[1:]]) uses the bare name nac-test rather than the full path that sys.argv[0] would provide (e.g., /home/user/.venv/bin/nac-test). The diagnostic script re-executes via bash -c, which relies on PATH resolution. In CI environments, Docker containers, or improperly exported virtualenvs, the bare name may not resolve. Since _reconstruct_command already strips --diagnostic from whatever argv it receives, passing sys.argv directly would work and be more robust. That said, ctx.command_path is arguably more portable for the general case -- I'd like to hear your reasoning on this one.

  2. Subprocess call argument assertions dropped from tests -- The previous test_true_with_output_dir_runs_script verified call_args[0] == "bash" and "nac-test-diagnostic.sh" in call_args[1]. The new test_propagates_script_exit_code only checks mock_run.assert_called_once() without inspecting what it was called with. A regression that changes the argument order, drops -o, or passes output_dir as a Path object instead of a string would go undetected. Worth restoring those assertions.

  3. Missing return type annotations on new test methods -- The existing test methods in this file consistently annotate -> None. The new TestRunDiagnostic methods drop them:

    def test_windows_errors_without_running_script(self, base_argv):  # missing -> None
    def test_propagates_script_exit_code(self, mock_run, base_argv):  # missing -> None + MagicMock type
  4. run_diagnostic docstring has a multi-line opening sentence -- The summary line wraps across two lines. The convention elsewhere in the file is a single-line summary followed by a blank line:

    """Run the diagnostic collection shell script.
    
    Wraps nac-test execution with pre/post environment capture and log collection.
  5. _reconstruct_command docstring stripped -- The full docstring with Args/Returns sections was replaced with a bare one-liner. Looking at _find_diagnostic_script right above it, which has the full treatment, restoring at least Args and Returns would keep it consistent.

  6. Missing class docstring on TestRunDiagnostic -- TestReconstructCommand and TestFindDiagnosticScript both have class docstrings. Adding """Tests for run_diagnostic function.""" would keep it consistent.

I'd consider items 1-3 as things to address before merge -- they're all quick fixes. Items 4-6 are docstring polish and could be follow-ups if you'd rather not hold this up further.

What do you think?

P.S. -- This comment was drafted using voice-to-text via Claude Code. If the tone comes across as overly direct or terse, please know that's just how it tends to phrase things. No offense or criticism is intended -- this is purely an objective technical review of the PR. Thanks for understanding!

Use sys.argv directly instead of [ctx.command_path, *sys.argv[1:]] so the
diagnostic script receives the full invocation path (e.g.
/home/user/.venv/bin/nac-test) rather than the bare command name. This is
more robust for bash -c re-execution in virtualenvs and CI environments.
Verify that subprocess.run receives the correct argument structure:
bash, script path, -o flag, and output directory. Prevents regressions
that change argument order or drop required flags from going undetected.
Reformat the opening sentence to a single-line summary followed by a
blank line and extended description, matching the file's convention
(see _find_diagnostic_script).
Add Args and Returns sections to match the docstring style of
_find_diagnostic_script and other functions in the module.
Consistent with TestReconstructCommand and TestFindDiagnosticScript
which both have class-level docstrings.
@oboehmer
Copy link
Copy Markdown
Collaborator Author

Thanks for the thorough review! Addressed in individual commits:

1. ctx.command_path vs sys.argv[0] — Agreed, sys.argv is more robust since it preserves the full invocation path for bash -c re-execution. Simplified to argv=sys.argv. → b0c81ce

2. Subprocess call assertions — Restored assertions verifying bash, script path, -o, and output dir in the correct order. → 34cae5d

3. Missing -> None annotations — I believe this is a false alarm — both test_windows_errors_without_running_script and test_propagates_script_exit_code already have -> None return annotations, and mock_run is typed as MagicMock. Happy to double-check if you're seeing something different.

4. Multi-line docstring opening — Fixed to single-line summary + blank line + details, matching the file's convention. → d1d32cb

5. _reconstruct_command docstring — Restored Args/Returns sections for consistency with _find_diagnostic_script. → 0baf236

6. TestRunDiagnostic class docstring — Added, consistent with the other test classes. → dcba2e8

@oboehmer oboehmer requested a review from aitestino April 14, 2026 10:33
@oboehmer oboehmer added the re-review re-review requested label Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working re-review re-review requested

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: --diagnostic bypasses required args and reports wrong exit code [bug] v 2.0.0a2 - Diagnostic collection fails

2 participants