
Issue#809 fix docstrings in benchmarking #2646

Open · wants to merge 7 commits into base: main

Conversation

adityagh006

Fixes #809

This PR updates and corrects the docstrings for various benchmarking functions in the Aeon toolkit. The changes improve clarity, consistency, and adherence to documentation standards. The updated files include:

1. thresholding.py
2. clustering.py
3. segmentation.py
4. published_results.py
5. resampling.py
6. results_loaders.py
7. stats.py

No new dependencies. These changes improve documentation readability, making it easier for developers and users to understand function behavior and expected inputs/outputs. Updated docstrings for benchmarking functions, verified formatting and consistency with existing documentation, and ensured the PR title follows the required format: issue#809-fix-docstrings-in-benchmarking.
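For context, the numpydoc layout these docstring fixes target looks roughly like this. This is a minimal sketch with a hypothetical function (`count_anomalous_ranges` is not code from the PR): plain types follow the colon in the Parameters section, and double backticks are reserved for inline references such as ``k`` in running text.

```python
def count_anomalous_ranges(y_true, k=1):
    """Count ground-truth anomalous ranges (hypothetical example).

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground truth binary anomaly labels.
    k : int, default=1
        Minimum number of anomalous ranges to look for.

    Returns
    -------
    count : int
        Number of contiguous anomalous ranges; compare against ``k``.
    """
    # Count transitions from 0 to 1, i.e. starts of anomalous ranges.
    count = 0
    prev = 0
    for label in y_true:
        if label == 1 and prev == 0:
            count += 1
        prev = label
    return count
```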

@aeon-actions-bot
Contributor

Thank you for contributing to aeon

I did not find any labels to add based on the title. Please add the [ENH], [MNT], [BUG], [DOC], [REF], [DEP] and/or [GOV] tags to your pull request titles. For now you can add the labels manually.
I have added the following labels to this PR based on the changes made: [ benchmarking, similarity search ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks

@adityagh006 adityagh006 force-pushed the issue#809-fix-docstrings-in-benchmarking branch from 872e65f to b949e59 Compare March 18, 2025 18:41
Member

@baraline baraline left a comment

Some inconsistencies in the changes; similarity search is out of scope of this PR.


Returns
-------
float
``float``
Threshold such that there are at least `k` anomalous ranges.
Member

Missing double backticks on `k`

Comment on lines +20 to +27
`y_true` : array-like of shape (n_samples,)
Ground truth target labels.
y_pred : array-like of shape (n_samples,)
`y_pred` : array-like of shape (n_samples,)
Cluster labels to evaluate.

Returns
-------
score : float
`score` : float
Member

Ticks on variable names and not on types, inconsistent with the rest.
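The convention at issue can be sketched on a hypothetical docstring (assumed names, not actual aeon code): parameter and return names on the definition line stay bare, while double backticks mark inline code references inside the descriptions.

```python
def clustering_accuracy(y_true, y_pred):
    """Return the fraction of matching labels (illustrative sketch only).

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground truth target labels.
    y_pred : array-like of shape (n_samples,)
        Cluster labels to evaluate; must have the same length as ``y_true``.

    Returns
    -------
    score : float
        Fraction of positions where ``y_true`` and ``y_pred`` agree.
    """
    # Simple element-wise agreement; real clustering metrics would first
    # align the label sets, but that is beside the formatting point here.
    matches = sum(a == b for a, b in zip(y_true, y_pred))
    return matches / len(y_true)
```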

"clustering", "regression". Not case-sensitive.
as_list: boolean, default=False
If True, returns a list instead of a dataframe.
`task`: str, default="classification"
Member

ticks on variable name, inconsistent with the rest


Returns
-------
data: pd.DataFrame or list
Standardised name as defined by NAME_ALIASES.
`data`: pd.DataFrame or list
Member

ticks on variable name and not on types, inconsistent with the rest

@@ -251,39 +251,43 @@ def get_estimator_results(

Parameters
----------
estimators : str ot list of str
`estimators` : str ot list of str
Member

ticks on variable name and not on types, inconsistent with the rest

include_missing is true.
num_resamples : int or None, default=None
``include_missing`` is ``true``.
`num_resamples` : int or None, default=None
Member

ticks on variable name and not on types, inconsistent with the rest

datasets : list of or None, default=1
``get_available_estimators``, ``aeon.benchmarking.results_loading.NAME_ALIASES``
or the directory at path for valid options.
`datasets` : list of or None, default=1
Member

ticks on variable name and not on types, inconsistent with the rest

Comment on lines +361 to +369
`task` : str, default="classification"
Should be one of ``aeon.benchmarking.results_loading.VALID_TASK_TYPES``. i.e.
`"classification"`, `"clustering"`, `"regression"`.
`measure` : str, default="accuracy"
Should be one of
`aeon.benchmarking.results_loading.VALID_RESULT_MEASURES[task]`.
Dependent on the task, i.e. for classification, `"accuracy"`, `"auroc"`,
`"balacc"` and regression, `"mse"`, `"mae"`, `"r2"`.
`remove_dataset_modifiers`: bool, default=False
Member

ticks on variable name and not on types, inconsistent with the rest

path : str, default="https://timeseriesclassification.com/results/ReferenceResults/"
i.e. a loaded result row for `"Dataset_eq"` will be converted to
just `"Dataset"`.
`path` : str,
Member

ticks on variable name and not on types, inconsistent with the rest

Comment on lines +386 to +388
`results`: 2D numpy array
Array of scores. Each column is a results for a classifier, each row a dataset.
names: list of str
`names`: list of str
Member

ticks on variable name and not on types, inconsistent with the rest

Labels: benchmarking (Benchmarking package), similarity search (Similarity search package)

Successfully merging this pull request may close these issues:

[DOC] Inconsistent double tick quotes in docstrings

2 participants