Merge pull request #22 from yutanagano/move_functional_api_to_top
Move functional api to top
yutanagano authored Jun 8, 2024
2 parents 045f41c + 59a50e6 commit 8037ab4
Showing 10 changed files with 86 additions and 100 deletions.
2 changes: 1 addition & 1 deletion docs/api.rst
@@ -4,5 +4,5 @@ API reference
.. toctree::
   :maxdepth: 2

   sceptr_sceptr
   sceptr
   sceptr_variant
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -11,7 +11,7 @@
project = "sceptr"
copyright = "2024, Yuta Nagano"
author = "Yuta Nagano"
version = sceptr.VERSION
version = sceptr.__version__

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -13,7 +13,7 @@ SCEPTR is a BERT-like transformer-based neural network implemented in `Pytorch <
With the default model providing best-in-class performance with only 153,108 parameters (typical protein language models have tens or hundreds of millions), SCEPTR runs fast- even on a CPU!
And if your computer does have a `CUDA-enabled GPU <https://en.wikipedia.org/wiki/CUDA>`_, the ``sceptr`` package will automatically detect and use it, giving you blazingly fast performance without the hassle.

``sceptr``'s :ref:`API <api>` exposes three intuitive functions: :py:func:`~sceptr.sceptr.calc_vector_representations`, :py:func:`~sceptr.sceptr.calc_cdist_matrix`, and :py:func:`~sceptr.sceptr.calc_pdist_vector`-- and it's all you need to make full use of the SCEPTR models.
``sceptr``'s :ref:`API <api>` exposes three intuitive functions: :py:func:`~sceptr.calc_vector_representations`, :py:func:`~sceptr.calc_cdist_matrix`, and :py:func:`~sceptr.calc_pdist_vector`-- and it's all you need to make full use of the SCEPTR models.
What's even better is that they are fully compliant with `pyrepseq <https://pyrepseq.readthedocs.io>`_'s `tcr_metric <https://pyrepseq.readthedocs.io/en/latest/api.html#pyrepseq.metric.tcr_metric.TcrMetric>`_ API, so ``sceptr`` will fit snugly into the rest of your repertoire analysis toolkit.

.. toctree::
7 changes: 7 additions & 0 deletions docs/sceptr.rst
@@ -0,0 +1,7 @@
.. _api:

`sceptr`
===============

.. automodule:: sceptr
   :members:
7 changes: 0 additions & 7 deletions docs/sceptr_sceptr.rst

This file was deleted.

21 changes: 3 additions & 18 deletions docs/usage.rst
@@ -1,31 +1,16 @@
Usage
=====

Functional API (:py:mod:`sceptr.sceptr`)
----------------------------------------
Functional API (Recommended)
----------------------------

.. tip::
To use the functional API, import the `sceptr` submodule like so:

>>> from sceptr import sceptr

Attempting to access the submodule as an attribute of the top level module

>>> import sceptr
>>> # ... load data, etc ...
>>> sceptr.sceptr.calc_vector_representations(df)

will result in an error.

The eponymous :py:mod:`sceptr.sceptr` submodule is the easiest way to use SCEPTR.
It loads the default SCEPTR variant and exposes its methods directly as module-level functions.
The functional API is accessible from the root module, and is the easiest way to use SCEPTR.

Model Variants (:py:mod:`sceptr.variant`)
-----------------------------------------

For more curious users, model variants are available to load and use through the :py:mod:`sceptr.variant` submodule.
The module exposes functions, each named after a particular model variant, which when called, will return a :py:class:`~sceptr.model.Sceptr` object corresponding to the selected model variant.
This :py:class:`~sceptr.model.Sceptr` object will then have the methods: `calc_pdist_vector`, `calc_cdist_matrix`, and `calc_vector_representations` available to use, with function signatures exactly as defined above for the functional API in the :py:mod:`sceptr.sceptr` submodule.

.. _data_format:

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -47,7 +47,7 @@ dev = [
include-package-data = true

[tool.setuptools.dynamic]
version = {attr = "sceptr.VERSION"}
version = {attr = "sceptr.__version__"}

[tool.pytest.ini_options]
filterwarnings = [
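Both `docs/conf.py` and `pyproject.toml` now read the version from `sceptr.__version__` instead of `sceptr.VERSION`. As a rough illustration of how a tool could pick up such an attribute without importing the package, the sketch below parses module source with the standard-library `ast` module and extracts the string literal assigned to `__version__`. The helper name `read_version` and the inline `module_source` string are hypothetical, and this is not a claim about how setuptools or Sphinx resolve the attribute internally.

```python
import ast


def read_version(source: str, attr: str = "__version__") -> str:
    """Return the string literal assigned to `attr` at module level.

    Hypothetical helper, for illustration only.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == attr:
                    # literal_eval accepts plain literals only, so no code is executed.
                    value = ast.literal_eval(node.value)
                    if isinstance(value, str):
                        return value
    raise AttributeError(f"{attr} not found as a module-level string literal")


# Stand-in for the relevant part of src/sceptr/__init__.py after this commit.
module_source = '"""Docstring."""\n\n__version__ = "1.0.0-beta.1"\n'
print(read_version(module_source))
```

A module that still defines only `VERSION` would raise `AttributeError` here, which is why the attribute rename and the two configuration changes belong in the same commit.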
74 changes: 71 additions & 3 deletions src/sceptr/__init__.py
@@ -1,7 +1,75 @@
"""
Simple Contrastive Embedding of the Primary sequence of T cell Receptors
========================================================================
SCEPTR is a small, fast, and performant TCR representation model for alignment-free TCR analysis.
The root module provides easy access to SCEPTR through a functional API which uses the default :py:class:`~sceptr.model.Sceptr` model.
"""

VERSION = "1.0.0-beta.1"
from sceptr import variant
from sceptr.model import Sceptr
import sys
from numpy import ndarray
from pandas import DataFrame


__version__ = "1.0.0-beta.1"


def calc_cdist_matrix(anchors: DataFrame, comparisons: DataFrame) -> ndarray:
    """
    Generate a cdist matrix between two collections of TCRs.

    Parameters
    ----------
    anchors : DataFrame
        DataFrame in the :ref:`prescribed format <data_format>`.
    comparisons : DataFrame
        DataFrame in the :ref:`prescribed format <data_format>`.

    Returns
    -------
    ndarray
        A 2D numpy ndarray representing a cdist matrix between TCRs from `anchors` and `comparisons`.
        The returned array will have shape (X, Y) where X is the number of TCRs in `anchors` and Y is the number of TCRs in `comparisons`.
    """
    return get_default_model().calc_cdist_matrix(anchors, comparisons)


def calc_pdist_vector(instances: DataFrame) -> ndarray:
    """
    Generate a pdist vector of distances between each pair of TCRs in the input data.

    Parameters
    ----------
    instances : DataFrame
        DataFrame in the :ref:`prescribed format <data_format>`.

    Returns
    -------
    ndarray
        A 1D numpy ndarray representing a pdist vector of distances between each pair of TCRs in `instances`.
        The returned array will have shape (1/2 * N * (N-1),), where N is the number of TCRs in `instances`.
    """
    return get_default_model().calc_pdist_vector(instances)


def calc_vector_representations(instances: DataFrame) -> ndarray:
    """
    Map a table of TCRs provided as a pandas DataFrame in the prescribed format to a set of vector representations.

    Parameters
    ----------
    instances : DataFrame
        DataFrame in the :ref:`prescribed format <data_format>`.

    Returns
    -------
    ndarray
        A 2D numpy ndarray object where every row vector corresponds to a row in `instances`.
        The returned array will have shape (N, D) where N is the number of TCRs in `instances` and D is the dimensionality of the SCEPTR model.
    """
    return get_default_model().calc_vector_representations(instances)


def get_default_model() -> Sceptr:
    # Load the default model lazily on first use and cache it on the module.
    if "_DEFAULT_MODEL" not in dir(sys.modules[__name__]):
        setattr(sys.modules[__name__], "_DEFAULT_MODEL", variant.default())
    return _DEFAULT_MODEL
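
`get_default_model` defers loading the default model until one of the functional API calls first needs it, then reuses the same object. A minimal sketch of the same lazy-initialisation idea, written with `functools.lru_cache` rather than the `sys.modules` attribute check used in the commit, might look like the following. The loader `load_default_variant` is a hypothetical stand-in for `variant.default()` and no real SCEPTR model is involved.

```python
from functools import lru_cache

load_calls = []  # records each time the stand-in loader actually runs


def load_default_variant():
    """Hypothetical stand-in for `variant.default()`; returns a dummy object."""
    load_calls.append(1)
    return object()


@lru_cache(maxsize=None)
def get_default_model():
    """Construct the default model on the first call, then return the cached one."""
    return load_default_variant()


first = get_default_model()
second = get_default_model()
print(first is second, len(load_calls))  # prints "True 1"
```

Either approach keeps `import sceptr` cheap, since the model is only constructed when a `calc_*` function is first called.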
67 changes: 0 additions & 67 deletions src/sceptr/sceptr/__init__.py

This file was deleted.

2 changes: 1 addition & 1 deletion tests/test_functional_api.py
@@ -1,4 +1,4 @@
from sceptr import sceptr
import sceptr
import numpy as np
import pandas as pd
import pytest
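The docstrings in `src/sceptr/__init__.py` state the output shapes of the functional API: (X, Y) for a cdist matrix and (1/2 * N * (N-1),) for a pdist vector. The sketch below illustrates those shape relations in pure Python, as a test of the functional API might check them. It makes several assumptions: Euclidean distance on plain tuples stands in for SCEPTR distances, the helpers `cdist_matrix` and `pdist_vector` are hypothetical, and the real functions take DataFrames and return numpy arrays.

```python
from itertools import combinations
from math import dist


def cdist_matrix(anchors, comparisons):
    """Distances from every anchor (rows) to every comparison (columns)."""
    return [[dist(a, c) for c in comparisons] for a in anchors]


def pdist_vector(instances):
    """Distances for each unordered pair (i, j) with i < j, in row-major order."""
    return [dist(a, b) for a, b in combinations(instances, 2)]


# Stand-in "representations": four 2D points instead of TCR embeddings.
vectors = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
n = len(vectors)

pd_vec = pdist_vector(vectors)
cd_mat = cdist_matrix(vectors, vectors[:2])

print(len(pd_vec))                   # 6, i.e. N * (N - 1) / 2 for N = 4
print(len(cd_mat), len(cd_mat[0]))   # 4 2, i.e. shape (X, Y)

# The pdist vector holds the strict upper triangle of the self-cdist matrix.
full = cdist_matrix(vectors, vectors)
upper = [full[i][j] for i in range(n) for j in range(i + 1, n)]
print(upper == pd_vec)               # True
```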
