@jdpearce4 (Collaborator)

Summary

Introduce a first-class Python client (TranscriptFormerClient) for inference and artifact/data downloads. The client mirrors the CLI configuration behavior while providing a simple, programmatic API that returns an in-memory AnnData object.

Key Features

  • In-memory inference: inference(...) returns an anndata.AnnData without writing to disk.
  • Config parity with CLI: builds a Hydra-compatible config (see the sketch after this list) by:
      • loading the same CLI YAML defaults,
      • applying dataclass overrides from kwargs (InferenceConfig, DataConfig),
      • merging with the checkpoint config.json via the same utility used by the CLI.

  • Convenience downloads:
      • download_model(...) for checkpoints and embeddings.
      • download_data(...) for CellxGene datasets by species.
      • download_dataset(...) for curated sources (e.g., Tabula Sapiens, Bgee).
  • Logging control: optional log_level argument to run quietly or verbosely without affecting global logging.
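As a rough illustration of that config-assembly flow (the YAML path, function name, and merge order below are assumptions, not the repo's actual build_config utility):

```python
import json

from omegaconf import OmegaConf


def build_inference_config(checkpoint_path: str, **overrides):
    """Sketch of the config-assembly flow; not the repo's actual utility."""
    # 1. Start from the same YAML defaults the CLI loads (path is a placeholder).
    cfg = OmegaConf.load("conf/inference_config.yaml")

    # 2. Layer on dataclass-style overrides passed as kwargs
    #    (fields of InferenceConfig / DataConfig).
    cfg = OmegaConf.merge(cfg, OmegaConf.create(overrides))

    # 3. Merge in the checkpoint's config.json, as the CLI does
    #    (merge precedence here is an assumption).
    with open(f"{checkpoint_path}/config.json") as f:
        cfg = OmegaConf.merge(cfg, OmegaConf.create(json.load(f)))

    return cfg
```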

API Surface

  • TranscriptFormerClient.inference(data_file, checkpoint_path, **kwargs) -> anndata.AnnData
      • Accepts most InferenceConfig and DataConfig fields as kwargs (e.g., batch_size, output_keys, gene_col_name, use_raw, use_oom_dataloader, n_data_workers, etc.).
      • Returns a single AnnData with obsm/uns populated per output_keys.
  • TranscriptFormerClient.download_model(model, checkpoint_dir=...) -> None
  • TranscriptFormerClient.download_data(species=[...], output_dir=..., ...) -> int
  • TranscriptFormerClient.download_dataset(dataset, ...) -> anndata.AnnData | None
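A usage sketch based on the surface above (the import path, model name, file paths, and output_keys values are illustrative assumptions):

```python
from transcriptformer import TranscriptFormerClient

# Quiet client; the log_level placement and accepted values are assumptions.
client = TranscriptFormerClient(log_level="WARNING")

# Fetch a checkpoint first (the model name shown here is illustrative).
client.download_model("tf_sapiens", checkpoint_dir="./checkpoints")

# In-memory inference: returns an anndata.AnnData, nothing written to disk.
adata = client.inference(
    data_file="cells.h5ad",
    checkpoint_path="./checkpoints/tf_sapiens",
    batch_size=8,
    output_keys=["embeddings"],
)
print(adata.obsm.keys())  # populated according to output_keys
```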

jdpearce4 and others added 9 commits July 22, 2025 17:21
…ce and artifact downloading

- Deleted `download_artifacts.py` and `inference.py` scripts as they are now replaced by CLI commands.
- Updated CLI commands to improve user experience and added progress tracking for downloads and extractions.
- Enhanced inference configuration to support backward compatibility for checkpoint paths.
- Updated documentation in the inference configuration YAML file to clarify model types and embedding options.
jdpearce4 requested a review from Copilot on August 25, 2025 at 23:20
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces a first-class Python client (TranscriptFormerClient) that enables programmatic inference and downloads for TranscriptFormer models. The client provides in-memory inference operations that return AnnData objects directly, mirroring CLI configuration behavior while offering a simplified API.

Key Changes

  • Adds Python client with inference(), download_model(), download_data(), and download_dataset() methods
  • Refactors config merging logic into reusable utility for consistency between CLI and client
  • Enhances inference configuration with new checkpoint_path and model_type fields
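For reference, the new fields might look roughly like this in the dataclass (types and defaults below are assumptions, not the actual definition in dataclasses.py):

```python
from dataclasses import dataclass


@dataclass
class InferenceConfig:
    # ...existing fields elided...
    # New in this PR: point directly at a checkpoint directory; model_type is
    # retained alongside it for backward compatibility (exact semantics may differ).
    checkpoint_path: str | None = None
    model_type: str | None = None
```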

Reviewed Changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 3 comments.

Summary per file:

  • src/transcriptformer/client/client.py: core client implementation with inference and download methods
  • src/transcriptformer/config/build_config.py: shared config-merging utility extracted from the CLI
  • src/transcriptformer/data/dataclasses.py: added checkpoint_path and model_type fields to InferenceConfig
  • src/transcriptformer/cli/inference.py: refactored to use the shared config-merging utility
  • src/transcriptformer/cli/download_artifacts.py: performance improvements to progress tracking
  • src/transcriptformer/__init__.py: package-level client exports
  • README.md: documentation for Python client usage


jdpearce4 changed the title from "feat: python client; resolves #45" to "feat: python client; resolves #44" on Aug 25, 2025