@llbbl commented Jun 26, 2025

Set up Python Testing Infrastructure

Summary

This PR establishes a comprehensive testing infrastructure for the Grover project using Poetry as the package manager and pytest as the testing framework. The setup provides a ready-to-use testing environment where developers can immediately start writing tests.

Changes Made

Package Management

  • Configured Poetry as the primary package manager via pyproject.toml
  • Migrated existing dependencies from requirements-gpu.txt and requirements-tpu.txt
  • Updated dependency versions for better compatibility

Testing Framework

  • Added pytest, pytest-cov, and pytest-mock as development dependencies
  • Configured pytest with:
    • Test discovery patterns for test_*.py and *_test.py files
    • Coverage reporting with 80% threshold
    • HTML and XML coverage output formats
    • Custom test markers: unit, integration, and slow
    • Strict mode with helpful output formatting
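The PR registers the unit, integration, and slow markers through the pytest configuration. As a rough illustration of what that registration amounts to, the same markers could also be registered from conftest.py with pytest's pytest_configure hook. This is a sketch of an alternative, not the configuration shipped in this PR, and the marker descriptions are assumptions.

```python
# conftest.py (sketch): register the three custom markers programmatically.
# The descriptions below are illustrative assumptions, not taken from the PR.
CUSTOM_MARKERS = {
    "unit": "fast, isolated unit tests",
    "integration": "tests that exercise several components together",
    "slow": "tests that take a long time to run",
}


def pytest_configure(config):
    """pytest hook: called once at startup with the active Config object."""
    for name, description in CUSTOM_MARKERS.items():
        # The "markers" ini option takes lines of the form "name: description".
        config.addinivalue_line("markers", f"{name}: {description}")
```

Registered markers show up in `pytest --markers`, and if strict marker checking (`--strict-markers`) is enabled, any marker that is not registered is reported as an error instead of being silently accepted.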

Directory Structure

tests/
├── __init__.py
├── conftest.py          # Shared fixtures and configuration
├── test_setup_validation.py  # Infrastructure validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Fixtures (in conftest.py)

  • temp_dir: Temporary directory for file operations
  • mock_config: Mock configuration dictionary
  • sample_json_data: Sample JSON test data
  • sample_jsonl_file: Creates temporary JSONL files
  • mock_model_checkpoint: Mock TensorFlow checkpoint structure
  • mock_vocab_files: Mock vocabulary files for tokenization
  • sample_model_config: Model configuration matching project format
  • environment_variables: Common test environment setup
  • cleanup_tensorflow: Automatic TensorFlow resource cleanup
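The fixture bodies are not shown in this description. The sketch below illustrates, using only the standard library, the kind of helpers that fixtures such as temp_dir and sample_jsonl_file might wrap. The function names (temporary_directory, write_jsonl, read_jsonl) are hypothetical and the actual conftest.py may be structured differently.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_directory():
    """Yield a fresh directory as a Path and delete it on exit."""
    with tempfile.TemporaryDirectory() as name:
        yield Path(name)


def write_jsonl(path, records):
    """Write one JSON object per line and return the path as a Path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


def read_jsonl(path):
    """Read a JSONL file back into a list of objects, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# In conftest.py these helpers would typically be exposed through functions
# decorated with @pytest.fixture, for example a temp_dir fixture that yields
# the Path from temporary_directory() so cleanup runs after each test.
```

Keeping the file-handling logic in plain functions like these makes it usable both from fixtures and directly from tests that need several files.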

Additional Configuration

  • Updated .gitignore with testing artifacts and Claude settings
  • Left poetry.lock out of .gitignore so that the lock file is committed
  • Added Poetry script commands for running tests

Running Tests

With Poetry (recommended):

# Install dependencies
poetry install

# Run all tests
poetry run test
# or
poetry run tests

# Run with specific markers
poetry run pytest -m unit
poetry run pytest -m integration
poetry run pytest -m "not slow"

Without Poetry:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install test dependencies
pip install pytest pytest-cov pytest-mock

# Run tests
pytest
pytest -v -m unit

Test Output

All validation tests pass successfully:

  • 17 tests validating infrastructure setup
  • All fixtures work correctly
  • Test markers function as expected
  • Coverage reporting generates properly

Notes

  • The project uses TensorFlow 1.x, so dependency versions are constrained accordingly
  • Coverage is configured to monitor lm, sample, and discrimination packages
  • The 80% coverage threshold is intended for the real unit tests written against those packages
  • The validation tests only verify that the infrastructure works and do not contribute to the coverage figure
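To make the 80% threshold concrete, here is a minimal sketch of the arithmetic behind a fail-under check such as pytest-cov's `--cov-fail-under` option. It is a simplification that counts statements only, and the function names are hypothetical. Treating an empty code base as fully covered is an assumption of this sketch.

```python
def coverage_percent(covered_statements, total_statements):
    """Return statement coverage as a percentage."""
    if total_statements == 0:
        # Assumption for this sketch: nothing to cover counts as 100%.
        return 100.0
    return 100.0 * covered_statements / total_statements


def meets_threshold(covered_statements, total_statements, threshold=80.0):
    """Return True when coverage is at or above the threshold."""
    return coverage_percent(covered_statements, total_statements) >= threshold
```

For example, 400 covered statements out of 500 is exactly 80% and passes, while 399 out of 500 is 79.8% and fails.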

Next Steps

Developers can now:

  1. Write unit tests in tests/unit/
  2. Write integration tests in tests/integration/
  3. Use the provided fixtures for common test scenarios
  4. Run tests with coverage reporting to ensure code quality

- Configure Poetry as package manager with pyproject.toml
- Add pytest, pytest-cov, and pytest-mock as dev dependencies
- Set up pytest configuration with coverage thresholds and custom markers
- Create testing directory structure (tests/, unit/, integration/)
- Add comprehensive conftest.py with reusable fixtures
- Update .gitignore with testing and Poetry entries
- Create validation tests to verify setup functionality