Skip to content

Conversation

@tristan-f-r
Copy link
Collaborator

@tristan-f-r tristan-f-r commented Jul 14, 2025

Note

This also contains changes that close #297 to resolve a dependency diamond. I was debating about splitting that to another PR, but that 'containers' PR would depend on #292 (giving this 2 degrees of dependency), and the types in PRA#run actually make that change easier to follow and motivate.

This also borrows (not a direct dependency) from #286. #286 should be merged after #329 is merged, as there's a good chance that we can just auto-generate documentation from the documentation here instead.

Closes #321, closes #296, and closes #297.

Arguments are now specified as a pydantic BaseModel with attached documentation:

class DominoParams(BaseModel):
    module_threshold: Optional[float]
    "the p-value threshold for considering a slice as relevant (optional)"

    slice_threshold: Optional[float]
    "the p-value threshold for considering a putative module as final module (optional)"

    model_config = ConfigDict(use_attribute_docstrings=True)

When constructing a PRM, this is passed in as a generic:

class DOMINO(PRM[DominoParams]):

For algorithms that don't specify parameters, the Empty type is preferred instead (whose signature is the empty BaseModel with an attached model_config that doesn't allow any other parameters.

This also changes the signature of PRA#run (reflecting the PR title) to be:

# Where T is the TypedVar which must be a pydantic.BaseModel
def run(inputs: dict[str, str | os.PathLike], output_file: str | os.PathLike, args: T, container_settings: ProcessedContainerOptions):

This has the disadvantage that inputs no longer has code completion, but this was probably something to not hide from the developer-user anyway, as we were passing inputs in via kwargs. See:

# This PR is more verbose when passing in arguments, which is a problem for people who
# are directly using PRA#run. However, I don't care about this audience.
PathLinker.run({"nodetypes": TEST_DIR+'input/sample-in-nodetypes.txt',
                "network": TEST_DIR+'input/sample-in-net.txt'},
                output_file=OUT_FILE_100,
                args=PathLinkerParams(k=100))

All of this gives us:

  • Encouraged, easily parsable PRM argument documentation
  • Parameter validation
  • a fully-specified JSON schema if feat: json schema #358
  • default factories, which are used to fix nondeterminism (feat: seeds #335).
  • Parameter types, to be used for parameter tuning for automatically determining what parameters to select
    • (and AST-based range parsing for us to easily increment the step size for parameter tuning)

Typed Implementation Details

  1. Since this now uses types, we prefer pathlib throughout.

For reviewers

algorithms.py is the only "dense code" in this PR. Otherwise, most of this is just refactoring all of the PRA#run calls to meet the new signature, specifying the new parameter models, and applying the suggestions in #296 (only in this PR) and #297 (see #387 and #390).

@tristan-f-r tristan-f-r changed the title Config args refactor!: typed PRA#run Jul 14, 2025
@tristan-f-r tristan-f-r changed the title refactor!: typed PRA#run feat!: typed PRA#run Jul 14, 2025
@tristan-f-r tristan-f-r added enhancement New feature or request needed for benchmarking Priority PRs needed for the benchmarking paper labels Jul 14, 2025
@tristan-f-r tristan-f-r changed the title feat!: typed PRA#run feat!: schema & typed PRA#run Jul 15, 2025
@tristan-f-r tristan-f-r added the tuning Workflow-spanning algorithm tuning label Oct 14, 2025
@ntalluri
Copy link
Collaborator

If this gets pushed in, should the documentation for the Contributing a new pathway reconstruction algorithm be updated as well (maybe in a different PR)?

Co-authored-by: Neha Talluri <78840540+ntalluri@users.noreply.github.com>
@github-actions github-actions bot removed the merge-conflict This PR has merge conflicts. label Oct 29, 2025
@tristan-f-r tristan-f-r added P-high This is a blocker for many PRs/issues/features and removed P-medium medium prirotity; this is needed for some external service or another PR labels Nov 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

blocked-by-other-pr enhancement New feature or request needed for benchmarking Priority PRs needed for the benchmarking paper P-high This is a blocker for many PRs/issues/features tuning Workflow-spanning algorithm tuning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Typed PRA#run [config] nested runs [config] containers

2 participants