…ing scoring and configuration in IO operations
…uration in IO operations
…ction and refactor PyProphetRunner class to use runner_config properties
…iSupervisedLearner
… main score setting
…ction and refactor PyProphetRunner class to use runner config properties
…on in decoy scores
…ing in IO-related files
… functions and io.util module
…ng and code organization
…fig for reading and deprecate swath_pretrained main score option
… single dispatcher for delegating
…--split_runs option
Pull Request Overview
This PR refactors the codebase by introducing dataclass-based I/O configurations, centralizing file reading/writing logic, and enhancing the CLI and documentation.
- Introduce `BaseIOConfig` dataclass and remove legacy `classifiers.py`
- Update dependencies and Docker/CI to support new I/O and diagnostics
- Add comprehensive Sphinx-based API/CLI/user-guide documentation and fixtures for tests
Reviewed Changes
Copilot reviewed 217 out of 217 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pyprophet/classifiers.py | Removed obsolete classifier implementations |
| pyprophet/_base.py | Added BaseIOConfig dataclass for unified I/O config |
| pyproject.toml | Added new dependencies (loguru, seaborn, etc.) |
| docs/user_guide/file_conversion.rst | New user guide for file conversion workflows |
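To make the overview concrete, here is a minimal sketch of how a dataclass-based I/O config can drive a single reader dispatcher. Only the name `BaseIOConfig` comes from this PR; its fields, the `file_type` property, the reader classes, and `get_reader` are hypothetical and are not taken from the PyProphet source.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BaseIOConfig:
    """Hypothetical stand-in; the real class in pyprophet/_base.py may differ."""

    infile: str
    outfile: str

    @property
    def file_type(self) -> str:
        # Assumed rule for illustration: pick the backend from the input suffix.
        suffix = Path(self.infile).suffix.lower()
        return {".osw": "osw", ".parquet": "parquet"}.get(suffix, "unknown")


class OSWReader:
    """Placeholder reader for .osw input (hypothetical)."""

    def __init__(self, config: BaseIOConfig) -> None:
        self.config = config


class ParquetReader:
    """Placeholder reader for .parquet input (hypothetical)."""

    def __init__(self, config: BaseIOConfig) -> None:
        self.config = config


# One lookup table, so callers never branch on file type themselves.
_READERS = {"osw": OSWReader, "parquet": ParquetReader}


def get_reader(config: BaseIOConfig):
    """Return the reader backend that matches the config's file type."""
    reader_cls = _READERS.get(config.file_type)
    if reader_cls is None:
        raise ValueError(f"No reader registered for: {config.infile}")
    return reader_cls(config)
```

With this shape, supporting a new format means registering one more entry in `_READERS` rather than touching every call site.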
Comments suppressed due to low confidence (3)
docs/user_guide/file_conversion.rst:8
- Correct the typo 'acheived' to 'achieved'.
… This can be acheived using with pyprophet using the following command:
docs/cli.rst:141
- Fix the typo 'snigle' to 'single'.
… convert the entire *.osw* file to a snigle parquet file …
pyprophet/_base.py:79
- Consider replacing `os.path.splitext` with `pathlib.Path.with_suffix` or similar to robustly handle directories and files.
] # TODO: use pathlib instead to avoid potential cases where the outfile is a directory …
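A rough sketch of what the pathlib suggestion could look like. The helper name `derive_outfile` and its directory rule are assumptions made for illustration, not code from `pyprophet/_base.py`. Note that `with_suffix` on its own treats a directory name such as `run1.oswpqd` exactly like a file name, so the directory case still needs an explicit check.

```python
from pathlib import Path


def derive_outfile(infile: str, new_suffix: str) -> Path:
    """Derive an output path from an input path (hypothetical helper)."""
    path = Path(infile)
    if path.is_dir():
        # For an existing directory, keep the full name and append the suffix
        # instead of replacing whatever follows the last dot.
        return path.with_name(path.name + new_suffix)
    # For a regular (or not yet existing) file, swap the final suffix.
    return path.with_suffix(new_suffix)
```

For a file path such as `results/run1.osw`, this returns `results/run1.parquet` when called with `".parquet"`; for an existing directory named `run1.oswpqd`, it returns `run1.oswpqd.parquet` rather than dropping the directory's extension.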
| :nested: full | ||
| The :program:`score` command has several advanced options that can be seen using the :option:`--helphelp` flag. |
Replace the repeated '--helphelp' with the correct '--help' flag.
| The :program:`score` command has several advanced options that can be seen using the :option:`--helphelp` flag. | |
| The :program:`score` command has several advanced options that can be seen using the :option:`--help` flag. |
| f" - is_tsv_file: {is_tsv_file(infile)}" | ||
| ) | ||
| sys.exit(1) |
Exiting the process from a library can be unexpected; consider raising a descriptive exception instead of calling sys.exit.
| sys.exit(1) | |
| raise FileTypeInferenceError( | |
| f"Failed to infer file type for: {infile}. Supported formats are: " | |
| ".osw, .sqmass/.sqMass, .parquet, split parquet directories (.oswpq/oswpqd), or .tsv files.\n" | |
| f" - is_sqlite_file: {is_sqlite_file(infile)}\n" | |
| f" - endswith .osw: {infile.endswith('.osw')}\n" | |
| f" - endswith .sqmass/.sqMass: {infile.endswith('.sqmass') or infile.endswith('.sqMass')}\n" | |
| f" - is_parquet_file: {is_parquet_file(infile)}\n" | |
| f" - is_valid_single_split_parquet_dir: {is_valid_single_split_parquet_dir(infile)}\n" | |
| f" - is_valid_multi_split_parquet_dir: {is_valid_multi_split_parquet_dir(infile)}\n" | |
| f" - is_tsv_file: {is_tsv_file(infile)}" | |
| ) |
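The pattern behind this suggestion can be sketched as follows: library code raises a dedicated exception, and only the outermost entry point turns it into an exit status. The exception name `FileTypeInferenceError` comes from the suggestion above; its base class, the simplified `infer_file_type` function, and `main` are assumptions for illustration and do not reproduce PyProphet's actual checks.

```python
import sys


class FileTypeInferenceError(ValueError):
    """Raised when the input file type cannot be determined."""


def infer_file_type(infile: str) -> str:
    """Simplified, suffix-only inference (the real checks also inspect the file)."""
    lowered = infile.lower()
    for suffix, file_type in ((".osw", "osw"), (".sqmass", "sqmass"),
                              (".parquet", "parquet"), (".tsv", "tsv")):
        if lowered.endswith(suffix):
            return file_type
    # Library code reports the failure and leaves the exit decision to the caller.
    raise FileTypeInferenceError(f"Failed to infer file type for: {infile}")


def main(argv: list) -> int:
    """Entry point: the only place where an error becomes an exit status."""
    try:
        print(infer_file_type(argv[0]))
    except FileTypeInferenceError as exc:
        print(exc, file=sys.stderr)
        return 1
    return 0
```

A script would call `sys.exit(main(sys.argv[1:]))`, while tests and other callers can catch `FileTypeInferenceError` directly.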
| # Dependencies required for runtime | ||
| dependencies = [ | ||
| "Click", | ||
| "loguru", |
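Add a minimum version constraint (e.g., loguru >= 0.x) so installs cannot resolve to an older, incompatible release; fully reproducible builds would additionally need exact pins or a lock file.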
| "loguru", | |
| "loguru >= 0.6.0", |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…te CLI documentation for merging files
…refactor/codebase
This PR is large; most of its changes refactor the codebase.
Refactored codebase to use dataclass configuration classes (ScoringIOConfig, IPFIOConfig, LevelContextIOConfig) for managing algorithm parameters and simplifying parameter passing across modules.
Centralized file I/O logic by introducing a unified io.dispatcher module to select the appropriate reader/writer backend based on config.
Added support for both .osw and .parquet formats throughout PyProphet, including the export, read, scoring, IPF, and levels-context steps.
Improved CLI structure using click.Group and shared decorators for consistent global flags and FDR parameters across subcommands.
Enabled grouped CLI subcommands (e.g., pyprophet levels-context peptide) for cleaner organization of related commands.
Integrated automatic API and CLI documentation.
Added internal diagnostics for row mismatches during score application to aid debugging.
Updated tests to use fixtures for easier reuse.
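The CLI structure described above (a `click.Group` with nested subcommand groups and shared option decorators) can be sketched as below. The `levels-context peptide` command path is from this description; the decorator name `shared_fdr_options`, the option names, and their defaults are hypothetical and may not match PyProphet's actual flags.

```python
import click


def shared_fdr_options(func):
    """Attach the same options to any subcommand (option names are hypothetical)."""
    func = click.option("--max-fdr", type=float, default=0.05, show_default=True,
                        help="Illustrative FDR cutoff.")(func)
    func = click.option("--verbose", is_flag=True, help="Illustrative global flag.")(func)
    return func


@click.group()
def cli():
    """Top-level command group."""


@cli.group(name="levels-context")
def levels_context():
    """Nested group that collects related subcommands."""


@levels_context.command()
@shared_fdr_options
def peptide(max_fdr, verbose):
    """Peptide-level subcommand; it only echoes its parsed options here."""
    click.echo(f"peptide max_fdr={max_fdr} verbose={verbose}")


if __name__ == "__main__":
    cli()
```

Because the options live in one decorator, every subcommand that applies it gets identical flag names, defaults, and help text.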
Documentation Enhancements
- `README.md` with instructions for building the documentation locally using Sphinx. (docs/README.md, R1-R8)
- docs/api/config.rst [1]; docs/api/io.rst [2]; docs/api/index.rst [3]
- `pyprophet` CLI commands, including subcommands for scoring, IPF inference, levels context, and export utilities, with detailed descriptions and usage examples. (docs/cli.rst, R1-R156)
Sphinx Configuration
- `conf.py` file for Sphinx, enabling automated documentation generation with support for extensions like `sphinx.ext.autodoc`, `sphinx_click`, and `sphinx_rtd_theme`. (docs/conf.py, R1-R145)