This repository was archived by the owner on Jul 10, 2021. It is now read-only.

Slow learning for high-dimensional input  #237

@hollma


Dear developers,

First of all, thank you for this fine piece of free software!

I have been using scikit-neuralnetwork 0.7 with scikit-learn 0.18.2, Theano 0.7.0, and Lasagne 0.1, and I noticed that learning halfspaces is quite slow when the examples are high-dimensional vectors, e.g. dim >= 1000. Such dimensions are common when using tf-idf vectors in text-classification settings.

A minimal working example (with runtime stats):
https://gist.github.com/hollma/f0d98bc5e58a6db34725dbce9ecdf9d1

Processing 500 training examples (each 100-dimensional) and validating on another 500 test examples took almost 14 seconds (on an Intel i7-4790 CPU running at 3.6 GHz).
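For comparison, here is a minimal sketch of the same kind of benchmark using scikit-learn's own MLPClassifier on synthetic halfspace data of similar shape. This is an assumption-laden stand-in, not the gist's exact code: the data is randomly generated, and the network size and iteration count are illustrative choices.

```python
import time
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for tf-idf features: 1000 examples, 100 dimensions
# (the gist's real data and hyperparameters may differ).
rng = np.random.default_rng(0)
X = rng.random((1000, 100))

# Halfspace labels: sign of a random linear functional, split at the median
# so the two classes are balanced.
w = rng.normal(size=100)
y = (X @ w > np.median(X @ w)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

start = time.perf_counter()
clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=300, random_state=0)
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)
elapsed = time.perf_counter() - start
print(f"accuracy={score:.3f}, time={elapsed:.2f}s")
```

Timing a plain scikit-learn model this way gives a baseline to judge whether the 14 seconds above comes from the problem itself or from scikit-neuralnetwork's Theano/Lasagne pipeline.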

What would you recommend? Are high-dimensional input vectors the wrong use case for scikit-neuralnetwork, i.e. should I use some other library instead?

I am looking forward to your answer.

Best regards,
Mario
