Implementing Fine-Tuning and Prompt-Tuning for TabPFN #273


Open · tleemann wants to merge 12 commits into main
Conversation

tleemann commented Apr 10, 2025

Creating a pull request for fine-tuning and prompt-tuning.

What has not been done yet, but might be nice to have:

  • Create tests
  • Check typing with mypy

CLAassistant commented Apr 10, 2025

CLA assistant check
All committers have signed the CLA.

iivalchev commented Apr 12, 2025

@tleemann Kudos for making the effort! Please note that some work on fine-tuning is already being done here: https://github.com/LennartPurucker/finetune_tabpfn_v2. I'm not sure whether your approach differs, but the change looks quite large. In the referenced repo, the goal is not to interfere with the main code base for the purposes of fine-tuning.

tleemann (Author) commented Apr 12, 2025

@iivalchev Thanks for pointing to your codebase! Don't worry, I was in touch with the maintainers before creating this request.
We thought it would be nice to additionally have native fine-tuning support; the changes in this request directly allow for:

  • Fine-tuning on multiple datasets
  • Efficient parallel preprocessing of multiple datasets with native torch DataLoaders
  • Prompt-tuning

@iivalchev

Right! I am quite keen on the strategies for training on multiple datasets. Has anything been done in that regard?

@tleemann (Author)

Yes, that's a challenging task. The idea here was to preprocess the datasets offline for additional speed through parallelism (the preprocessing uses numpy). The classifier now has a get_preprocessed_datasets function, which returns a meta-dataset that can be used in a torch.utils.data.DataLoader to quickly load different preprocessed datasets (or batches of a single dataset) for fine-tuning.
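
Roughly, the intended workflow looks like the sketch below. Only the existence of get_preprocessed_datasets and the DataLoader usage come from this request; the exact signature, argument names, and the fine-tuning loop are illustrative, so please check the example file for the real API:

```python
# Illustrative sketch only: the exact signature of get_preprocessed_datasets,
# the argument names, and the fine-tuning loop are assumptions, not the
# verified API of this PR.
import numpy as np
from torch.utils.data import DataLoader
from tabpfn import TabPFNClassifier

# Two small toy datasets standing in for real fine-tuning data.
X_a, y_a = np.random.rand(200, 5), np.random.randint(0, 2, size=200)
X_b, y_b = np.random.rand(150, 8), np.random.randint(0, 3, size=150)

clf = TabPFNClassifier()

# Offline, numpy-based preprocessing of several datasets at once;
# the result is a meta-dataset covering all of them.
meta_dataset = clf.get_preprocessed_datasets([X_a, X_b], [y_a, y_b])

# The meta-dataset plugs into a standard torch DataLoader, so fine-tuning can
# stream batches from different preprocessed datasets.
loader = DataLoader(meta_dataset, batch_size=1, shuffle=True)

for batch in loader:
    # forward pass through the TabPFN model, loss computation, optimizer step
    ...
```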

Please have a look at examples/tabpfn_finetune.py (in this request), where I tried to show the usage. The example only uses one dataset, but it can easily be extended by passing a list of datasets to the get_preprocessed_datasets function. Hope this helps.

Best,
Tobias

tleemann changed the title from "Implementing Fine-Tuning and Prompt Tuning for TabPFN" to "Implementing Fine-Tuning and Prompt-Tuning for TabPFN" on Apr 12, 2025
@iivalchev

Cool! Will do. Do you think test-time training can also be accommodated? And I assume fine-tuning for the regressor would also be supported?
