Fix cluster_loss error and add demo for batch correction with no bio labels #26
This pull request introduces a new utility function for joining multiple pandas DataFrames, improves robustness in the cluster loss calculation, and includes minor documentation and configuration updates. The most significant changes are detailed below.
New utility function:
- Added `df_joiner` function to `src/abaco/utils.py` for joining multiple pandas DataFrames on a specified column, with precondition checks and support for different join types. (Fb5c2b54L457)

Bug fix / robustness improvement:
- Updated `cluster_loss` method in `src/abaco/ABaCo.py` to handle the case where there are no positive KL divergence values, preventing errors when computing the minimum. [1] [2]

Documentation and configuration:
- Added `tutorial/demo-mgnify-tomatoes` to the documentation index in `docs/index.md`.
- Updated `jupytext_version` metadata in tutorial scripts to 1.18.1 for consistency. [1] [2]
- Added `pandas` as a dependency in `src/abaco/utils.py` to support the new utility function.
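
For reviewers who want a feel for the kind of helper described under "New utility function", here is a minimal sketch of joining several DataFrames on one column with up-front checks. It is not the `df_joiner` code from this PR: the name `join_frames_on`, its signature, the set of accepted join types, and the sample data are all assumptions made for illustration.

```python
from functools import reduce

import pandas as pd

# Hypothetical helper for illustration only; the real df_joiner in
# src/abaco/utils.py may differ in name, signature, and checks.
VALID_JOINS = {"inner", "outer", "left", "right"}


def join_frames_on(frames, on, how="inner"):
    """Join a sequence of DataFrames on a shared column, left to right."""
    frames = list(frames)
    # Precondition checks: fail early with a clear message.
    if not frames:
        raise ValueError("at least one DataFrame is required")
    if how not in VALID_JOINS:
        raise ValueError(f"unsupported join type: {how!r}")
    for i, frame in enumerate(frames):
        if on not in frame.columns:
            raise KeyError(f"DataFrame at position {i} has no column {on!r}")
    # Fold the list pairwise: ((a JOIN b) JOIN c) ...
    return reduce(lambda left, right: pd.merge(left, right, on=on, how=how), frames)


# Made-up example data.
samples = pd.DataFrame({"sample_id": ["s1", "s2", "s3"], "batch": ["A", "A", "B"]})
counts = pd.DataFrame({"sample_id": ["s1", "s2"], "reads": [120, 95]})
meta = pd.DataFrame({"sample_id": ["s2", "s3"], "tissue": ["leaf", "root"]})

inner = join_frames_on([samples, counts, meta], on="sample_id")
outer = join_frames_on([samples, counts, meta], on="sample_id", how="outer")
```

With an inner join only samples present in every frame survive, while an outer join keeps all of them and fills the gaps with missing values. Note that `pd.merge` adds `_x`/`_y` suffixes when non-key column names collide, which a production helper may want to check for as well.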
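
The `cluster_loss` fix guards against taking a minimum over an empty set of positive KL divergence values. The pattern can be sketched in plain Python as below. This is an assumption-laden illustration, not the code in `src/abaco/ABaCo.py`: the function name `min_positive_kl` and the fallback value of `0.0` are hypothetical, and the PR summary does not say what the method returns in the empty case.

```python
def min_positive_kl(kl_values, fallback=0.0):
    """Return the smallest strictly positive value, or `fallback` if none exist.

    Hypothetical stand-in for the guard described in this PR; the fallback
    value is an assumption, not taken from the actual implementation.
    """
    positive = [v for v in kl_values if v > 0]
    # min() on an empty sequence raises ValueError unless a default is given.
    return min(positive, default=fallback)


with_positive = min_positive_kl([0.0, 0.3, 0.1])
without_positive = min_positive_kl([0.0, -1.0])
```

Without the guard, calling `min` on the empty filtered list raises `ValueError`. Tensor libraries behave similarly when reducing an empty selection, which is presumably the kind of error the PR title refers to.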