Added an adaptive batching mechanism to handle large test sets based on estimated memory usage #262

Open
wants to merge 2 commits into main

Conversation

Krishnadubey1008 (Contributor)

This PR fixes #125

Description

Added an adaptive batching mechanism to handle large test sets, based on estimated memory usage. Expert users can override the default behavior by adjusting the memory_saving_mode parameter.

Changes made

  1. Added a method to estimate memory usage.
  2. Modified the predict_proba method to use adaptive batching (a rough sketch of the intended behavior follows below).
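For illustration, a minimal sketch of how such adaptive batching could work, assuming a memory budget and an estimate of the full-pass memory cost. The function and parameter names below are illustrative, not the PR's actual implementation; presumably the memory_saving_mode parameter decides whether this path is taken at all.

```python
import numpy as np


def predict_proba_in_batches(predict_fn, X, estimated_bytes, memory_budget_bytes):
    """Run predict_fn over X in batches sized to stay within a memory budget.

    predict_fn          -- callable returning class probabilities for a slice of X
    X                   -- 2D array of test samples
    estimated_bytes     -- estimated memory needed to predict on all of X at once
    memory_budget_bytes -- memory we are willing to spend per forward pass
    """
    n_samples = X.shape[0]
    if estimated_bytes <= memory_budget_bytes:
        # Everything fits: a single forward pass, no batching overhead.
        return predict_fn(X)

    # Scale the batch size so each slice is expected to fit within the budget.
    batch_size = max(1, int(n_samples * memory_budget_bytes / estimated_bytes))

    probas = [
        predict_fn(X[i : i + batch_size])
        for i in range(0, n_samples, batch_size)
    ]
    return np.vstack(probas)
```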

noahho requested a review from Copilot on April 4, 2025 at 15:49
Copilot AI left a comment

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment on lines +560 to +562
X[i:i + batch_size]
for output, config in self.executor_.iter_outputs(
X,

Copilot AI Apr 4, 2025

The slice expression 'X[i:i + batch_size]' is not assigned or used, which likely means the intended batch processing is incomplete. Consider assigning the slice to a variable (e.g. batch = X[i:i + batch_size]) and then processing it through the executor.

Suggested change:
-X[i:i + batch_size]
-for output, config in self.executor_.iter_outputs(
-    X,
+batch = X[i:i + batch_size]
+for output, config in self.executor_.iter_outputs(
+    batch,
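For context, the assembled batching loop would presumably look something like the sketch below. The iter_outputs call is taken from the snippet above; the outer loop, batch_size, and the way outputs are collected are illustrative guesses, not the PR's actual code.

```python
outputs = []
for i in range(0, X.shape[0], batch_size):
    # Assign the slice so it is actually used, per the suggestion above.
    batch = X[i:i + batch_size]
    for output, config in self.executor_.iter_outputs(batch):
        # Collect per-configuration outputs for this batch; downstream code
        # would aggregate them into probabilities for the full test set.
        outputs.append((output, config))
```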


noahho (Collaborator) commented Apr 4, 2025

Hi @Krishnadubey1008,

Thanks for tackling issue #125 and adding the adaptive batching mechanism.

To get this merged, could you please address the following points:

  1. Memory Estimation Function: Please move the memory estimation logic into a reusable function within our utils repository. This helps keep common utilities centralized. You can use this implementation as a reference: https://github.com/PriorLabs/tabpfn_common_utils/blob/524cee72cc6f33cf59fc943dc3e4b5428f3a79bc/expense_estimation.py#L9 (a rough sketch of such a helper follows this list).
  2. CI Checks: The automated tests and the Ruff linter are currently failing. Please investigate the errors shown in the CI check logs and apply the necessary fixes.
  3. Copilot Suggestion: Please review the comment/suggestion made by GitHub Copilot on this PR, as it may contain relevant points.
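A rough sketch of what such a reusable memory-estimation helper could look like; the actual implementation in tabpfn_common_utils may differ, and the formula, names, and constants below are illustrative assumptions.

```python
def estimate_memory_usage_bytes(
    n_train: int,
    n_test: int,
    n_features: int,
    bytes_per_float: int = 4,
    overhead_factor: float = 2.0,
) -> float:
    """Roughly estimate the memory needed for one forward pass.

    The dominant terms are assumed to be attention over the combined
    train + test sequence and the feature embeddings; overhead_factor
    covers activations and temporary buffers.
    """
    n_rows = n_train + n_test
    attention_bytes = n_rows * n_rows * bytes_per_float
    embedding_bytes = n_rows * n_features * bytes_per_float
    return overhead_factor * (attention_bytes + embedding_bytes)
```

A caller could then compare the returned estimate against the available memory to decide whether batching is needed and how large each batch should be.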
Let me know if you have any questions!

Successfully merging this pull request may close these issues: Batch predictions when test set is large.