How to apply a custom preprocessor to only specified features

I would like to extend auto-sklearn to handle datasets with both numerical and textual features. In particular, I want to implement a custom preprocessor that takes a textual feature and applies a TF-IDF transformation to it.

This has raised a few concerns/questions in my head:

- Since auto-sklearn does not accept features of type `object`, I will have to cast my text feature to type `category`, but I do not want the standard categorical preprocessors (e.g., OHE) to be executed on this text feature by accident. Is there a way to achieve this?
- How can I be sure that my custom preprocessor is only executed on my textual feature, and not on the other (numeric) features?
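
To illustrate the behavior I am after: in plain scikit-learn (outside auto-sklearn), this kind of per-column routing can be sketched with `ColumnTransformer`. The column names and toy data below are made up for illustration, and this is not auto-sklearn's API; it only shows TF-IDF being applied to the text column while the numeric columns go through a separate transformer.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two numeric columns and one free-text column.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [40000.0, 52000.0, 81000.0, 62000.0],
    "review": ["great product", "bad service", "great service", "bad product"],
})

preprocessor = ColumnTransformer(
    transformers=[
        # A string selector (not a list) hands the column to TfidfVectorizer
        # as a 1-D sequence of documents, which is what it expects.
        ("text", TfidfVectorizer(), "review"),
        # Numeric columns are selected by name and never reach the vectorizer.
        ("num", StandardScaler(), ["age", "income"]),
    ]
)

X = preprocessor.fit_transform(df)

# Vocabulary learned from the text column only.
vocab = sorted(preprocessor.named_transformers_["text"].vocabulary_)
print(vocab)
print(X.shape)
```

What I cannot tell is whether auto-sklearn's pipeline offers an equivalent way to restrict a custom preprocessor to chosen columns.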

If the above is simply not possible with the current auto-sklearn architecture, would you be interested in a pull request that extends auto-sklearn to handle textual features?

How to apply a custom preprocessor to only specified features #1110
