How to apply a custom preprocessor to only specified features #1110

Open
@stepthom

Description

I would like to extend auto-sklearn to handle datasets with both numerical and textual features. In particular, I want to implement a custom preprocessor that takes a textual feature and applies a TF-IDF transformation to it.

This has raised a few concerns/questions in my head:

  • Since auto-sklearn does not accept features of type object, I will have to cast my text feature to type category, but I do not want the standard categorical preprocessors (e.g., one-hot encoding) to be executed on this text feature by accident. Is there a way to achieve this?
  • How can I be sure that my custom preprocessor is executed only on my textual feature, and not on the other (numeric) features?

If the above is simply not possible with the current auto-sklearn architecture, would you be interested in a pull request that extends auto-sklearn to handle textual features?
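
To make the two questions above concrete, here is a minimal sketch of the per-column routing I have in mind, written with plain scikit-learn's `ColumnTransformer` rather than any auto-sklearn API (whether auto-sklearn can express this is exactly what I am asking). The column names and toy data are hypothetical, and the text column is left as plain strings here instead of being cast to category.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: one free-text column and two numeric columns.
X = pd.DataFrame({
    "review": ["good product", "bad service", "good service", "bad product"],
    "price": [10.0, 20.0, 15.0, 5.0],
    "rating": [4.0, 1.0, 5.0, 2.0],
})

# TfidfVectorizer expects a 1-D sequence of strings, so the text column is
# selected with a plain string ("review"), which passes a Series.
# The numeric columns are selected with a list, which passes a 2-D frame.
preprocessor = ColumnTransformer(
    transformers=[
        ("text", TfidfVectorizer(), "review"),           # TF-IDF on text only
        ("num", StandardScaler(), ["price", "rating"]),  # scaling on numerics only
    ],
    remainder="drop",
    sparse_threshold=0.0,  # always return a dense array for easy inspection
)

Xt = preprocessor.fit_transform(X)

# Vocabulary learned by the fitted TF-IDF step (from the text column only).
vocab = sorted(preprocessor.named_transformers_["text"].vocabulary_)
```

In this sketch the TF-IDF step never sees `price` or `rating`, and no categorical encoder ever sees `review`; the output columns are the TF-IDF features followed by the two scaled numeric features. This is the behaviour I would like to reproduce inside auto-sklearn's pipeline.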

Labels: enhancement (a new improvement or feature)