Skip to content

Comments

Refactor/codebase#142

Merged
singjc merged 218 commits intoPyProphet:masterfrom
singjc:refactor/codebase
Jun 3, 2025
Merged

Refactor/codebase#142
singjc merged 218 commits intoPyProphet:masterfrom
singjc:refactor/codebase

Conversation

@singjc
Copy link
Contributor

@singjc singjc commented Jun 1, 2025

This PR is pretty large, but introduces a few updates, mostly in refactoring the code base.

  • Refactored codebase to use dataclass configuration classes (ScoringIOConfig, IPFIOConfig, LevelContextIOConfig) for managing algorithm parameters and simplifying parameter passing across modules.

  • Centralized file I/O logic by introducing a unified io.dispatcher module to select the appropriate reader/writer backend based on config.

  • Added support for both .osw and .parquet formats throughout the PyProphet, including export, read, and scoring, ipf and level context steps.

  • Improved CLI structure using click.Group and shared decorators for consistent global flags and FDR parameters across subcommands.

  • Enabled grouped CLI subcommands (e.g., pyprophet levels-context peptide) for cleaner organization of related commands.

  • Integrated automatic API and CLI documentation.

  • Added internal diagnostics for row mismatches during score application to aid debugging.

  • Updated tests to use fixtures for easier re-use

Documentation Enhancements

  • General Documentation:
    • Added a new section in README.md with instructions for building the documentation locally using Sphinx. (docs/README.md, docs/README.mdR1-R8)
  • API References:
    • Introduced automated API documentation for configuration, IO, CLI commands, and export utilities, including structured sections for scoring, IPF, levels context, and export configurations. (docs/api/config.rst, [1]; docs/api/io.rst, [2]; docs/api/index.rst, [3]
  • CLI Documentation:
    • Documented the pyprophet CLI commands, including subcommands for scoring, IPF inference, levels context, and export utilities, with detailed descriptions and usage examples. (docs/cli.rst, docs/cli.rstR1-R156)

Sphinx Configuration

  • Added a new conf.py file for Sphinx, enabling automated documentation generation with support for extensions like sphinx.ext.autodoc, sphinx_click, and sphinx_rtd_theme. (docs/conf.py, docs/conf.pyR1-R145)

singjc added 30 commits May 19, 2025 02:01
…ing scoring and configuration in IO operations
…ction and refactor PyProphetRunner class to use runner_config properties
…ction and refactor PyProphetRunner class to use runner config properties
…fig for reading and deprecate swath_pretrained main score option
@singjc singjc requested a review from Copilot June 3, 2025 19:59
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the codebase by introducing dataclass-based I/O configurations, centralizing file reading/writing logic, and enhancing the CLI and documentation.

  • Introduce BaseIOConfig dataclass and remove legacy classifiers.py
  • Update dependencies and Docker/CI to support new I/O and diagnostics
  • Add comprehensive Sphinx-based API/CLI/user-guide documentation and fixtures for tests

Reviewed Changes

Copilot reviewed 217 out of 217 changed files in this pull request and generated 4 comments.

File Description
pyprophet/classifiers.py Removed obsolete classifier implementations
pyprophet/_base.py Added BaseIOConfig dataclass for unified I/O config
pyproject.toml Added new dependencies (loguru, seaborn, etc.)
docs/user_guide/file_conversion.rst New user guide for file conversion workflows
Comments suppressed due to low confidence (3)

docs/user_guide/file_conversion.rst:8

  • Correct the typo 'acheived' to 'achieved'.
… This can be acheived using with pyprophet using the following command:

docs/cli.rst:141

  • Fix the typo 'snigle' to 'single'.
… convert the entire *.osw* file to a snigle parquet file …

pyprophet/_base.py:79

  • Consider replacing os.path.splitext with pathlib.Path.with_suffix or similar to robustly handle directories and files.
]  # TODO: use pathlib instead to avoid potential cases where the outfile is a directory …

:nested: full


The :program:`score` command has several advanced options that can be seen using the :option:`--helphelp` flag.
Copy link

Copilot AI Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace the repeated '--helphelp' with the correct '--help' flag.

Suggested change
The :program:`score` command has several advanced options that can be seen using the :option:`--helphelp` flag.
The :program:`score` command has several advanced options that can be seen using the :option:`--help` flag.

Copilot uses AI. Check for mistakes.
f" - is_tsv_file: {is_tsv_file(infile)}"
)

sys.exit(1)
Copy link

Copilot AI Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exiting the process from a library can be unexpected; consider raising a descriptive exception instead of calling sys.exit.

Suggested change
sys.exit(1)
raise FileTypeInferenceError(
f"Failed to infer file type for: {infile}. Supported formats are: "
".osw, .sqmass/.sqMass, .parquet, split parquet directories (.oswpq/oswpqd), or .tsv files.\n"
f" - is_sqlite_file: {is_sqlite_file(infile)}\n"
f" - endswith .osw: {infile.endswith('.osw')}\n"
f" - endswith .sqmass/.sqMass: {infile.endswith('.sqmass') or infile.endswith('.sqMass')}\n"
f" - is_parquet_file: {is_parquet_file(infile)}\n"
f" - is_valid_single_split_parquet_dir: {is_valid_single_split_parquet_dir(infile)}\n"
f" - is_valid_multi_split_parquet_dir: {is_valid_multi_split_parquet_dir(infile)}\n"
f" - is_tsv_file: {is_tsv_file(infile)}"
)

Copilot uses AI. Check for mistakes.
# Dependencies required for runtime
dependencies = [
"Click",
"loguru",
Copy link

Copilot AI Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pin dependency versions (e.g., loguru >= 0.x) to ensure reproducible builds.

Suggested change
"loguru",
"loguru >= 0.6.0",

Copilot uses AI. Check for mistakes.
@singjc singjc marked this pull request as ready for review June 3, 2025 22:49
@singjc singjc merged commit 87bbbc0 into PyProphet:master Jun 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant