Failing configurations

* EDIT: After some investigation, this appears to have less to do with the configurations and more to do with imputation. After trying to recreate the failures, it seems that `X` arrays that reach a threshold of NaNs cause the configurations to fail. These NaNs are added randomly, which explains why the failures are infrequent.

* Edit2: `"fast_ica"` with `"fun":"exp"` fails if NaNs make up the majority of the data.

* Edit3: `"fast_ica"` with `"fast_ica:whiten" : "False"` fails with NaNs in the input.

* Edit4: `"fast_ica"` with `"whiten" : "False"` fails even with no NaN values present.

* Edit5: `"fast_ica"` works on the `"iris"` dataset, even with a high occurrence of NaNs. It seems that the failure depends more on the frequency of 0s in the dataset than on the NaNs.

* Edit6: Forcing a certain `"feature_preprocessor:__choice__"` is currently not possible. Manually editing the Config is not straightforward and should be revisited once `ConfigSpace.Configuration` is updated to allow easier modification of a `Config`. See [ConfigSpace #205](https://github.com/automl/ConfigSpace/issues/205) for why it is not straightforward to delete a key and add a new one.
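
Since the edits above point at the share of NaNs and 0s in `X` rather than at the configuration itself, it may help to record both numbers next to each failing configuration. The following is only a minimal sketch, assuming `X` is available as a dense 2D list of numbers; the helper name `nan_and_zero_fractions` is hypothetical and not part of auto-sklearn.

```python
import math


def nan_and_zero_fractions(X):
    """Return (nan_fraction, zero_fraction) over all entries of a 2D list.

    Hypothetical diagnostic helper: it only counts entries and does not
    perform imputation or run any pipeline component.
    """
    total = nans = zeros = 0
    for row in X:
        for value in row:
            total += 1
            if math.isnan(value):
                nans += 1
            elif value == 0:
                zeros += 1
    if total == 0:
        raise ValueError("X has no entries")
    return nans / total, zeros / total


# Toy data, made up for illustration only.
X = [
    [0.0, 1.5, float("nan")],
    [0.0, float("nan"), 2.0],
    [3.0, 0.0, float("nan")],
    [float("nan"), 0.0, 1.0],
]
nan_frac, zero_frac = nan_and_zero_fractions(X)
print(f"NaN fraction: {nan_frac:.2f}, zero fraction: {zero_frac:.2f}")
```

Logging these two fractions for passing and failing runs would show whether the failures track the NaNs, the 0s, or neither.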

We leave some randomness in the configurations that get tested when testing different classifier and regressor components. The configurations that failed are collected here:

---
* Python version 3.8
* [Test](https://github.com/automl/auto-sklearn/runs/4446897874?check_suite_focus=true)
* `test/test_pipeline/test_classification.py::SimpleClassificationPipelineTest::test_configurations_sparse`

```
Configuration:
  balancing:strategy, Value: 'weighting'
  classifier:__choice__, Value: 'sgd'
  classifier:sgd:alpha, Value: 7.27693595714389e-05
  classifier:sgd:average, Value: 'False'
  classifier:sgd:eta0, Value: 0.013654826040547558
  classifier:sgd:fit_intercept, Constant: 'True'
  classifier:sgd:learning_rate, Value: 'invscaling'
  classifier:sgd:loss, Value: 'log'
  classifier:sgd:penalty, Value: 'l1'
  classifier:sgd:power_t, Value: 0.5468767593727824
  classifier:sgd:tol, Value: 8.162675288740052e-05
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'mean'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 467
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
  feature_preprocessor:__choice__, Value: 'kernel_pca'
  feature_preprocessor:kernel_pca:gamma, Value: 6.985386846337043
  feature_preprocessor:kernel_pca:kernel, Value: 'rbf'
  feature_preprocessor:kernel_pca:n_components, Value: 10
```

---
* Python version 3.8
* [Test](https://github.com/automl/auto-sklearn/runs/4237539251?check_suite_focus=true) 
* `test/test_pipeline/test_classification.py::SimpleClassificationPipelineTest::test_configurations_signed_data`

```
Configuration:
  balancing:strategy, Value: 'weighting'
  classifier:__choice__, Value: 'lda'
  classifier:lda:shrinkage, Value: 'auto'
  classifier:lda:tol, Value: 0.038890093430048595
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'minority_coalescer'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:minority_coalescer:minimum_fraction, Value: 0.001521146558163954
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'mean'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'none'
  feature_preprocessor:__choice__, Value: 'fast_ica'
  feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
  feature_preprocessor:fast_ica:fun, Value: 'exp'
  feature_preprocessor:fast_ica:whiten, Value: 'False'
```

---

* Python version 3.10
* [Test](https://github.com/automl/auto-sklearn/runs/4455775008?check_suite_focus=true)
* `SimpleClassificationPipelineTest.test_configurations_sparse`

```
Configuration:
  balancing:strategy, Value: 'none'
  classifier:__choice__, Value: 'qda'
  classifier:qda:reg_param, Value: 0.7722372097734942
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'median'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 1761
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
  feature_preprocessor:__choice__, Value: 'kernel_pca'
  feature_preprocessor:kernel_pca:gamma, Value: 2.351280410584469
  feature_preprocessor:kernel_pca:kernel, Value: 'rbf'
  feature_preprocessor:kernel_pca:n_components, Value: 10
```

---

* Python version 3.8
* [Test](https://github.com/automl/auto-sklearn/runs/4455775398?check_suite_focus=true)
* `SimpleClassificationPipelineTest.test_configurations_signed_data`

```
Configuration:
  balancing:strategy, Value: 'none'
  classifier:__choice__, Value: 'gaussian_nb'
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'most_frequent'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 1004
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
  feature_preprocessor:__choice__, Value: 'fast_ica'
  feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
  feature_preprocessor:fast_ica:fun, Value: 'exp'
  feature_preprocessor:fast_ica:whiten, Value: 'False'
```
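
To replay one of the dumps above locally, or to try a different feature preprocessor as Edit6 discusses, the printed text can be parsed into a plain dict first. This is a rough sketch that assumes the `key, Value: ...` / `key, Constant: ...` line format shown above; `parse_configuration_dump` and `replace_choice` are hypothetical helpers, not auto-sklearn or ConfigSpace functions.

```python
import ast
import re

# Matches lines such as "  classifier:sgd:alpha, Value: 7.2e-05".
LINE_RE = re.compile(
    r"^\s*(?P<key>[^,\s]+), (?:Value|Constant): (?P<value>.+?)\s*$"
)


def parse_configuration_dump(text):
    """Parse a printed configuration into a {hyperparameter: value} dict."""
    values = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skips the "Configuration:" header and blank lines
        # Quoted values stay strings ('False' -> "False"); numbers become
        # int or float.
        values[match["key"]] = ast.literal_eval(match["value"])
    return values


def replace_choice(values, component, new_choice, new_params=None):
    """Return a copy with every `component:*` key replaced by a new choice."""
    prefix = component + ":"
    updated = {k: v for k, v in values.items() if not k.startswith(prefix)}
    updated[prefix + "__choice__"] = new_choice
    for name, value in (new_params or {}).items():
        updated[f"{prefix}{new_choice}:{name}"] = value
    return updated


# Shortened copy of the last dump above.
dump = """
Configuration:
  balancing:strategy, Value: 'none'
  classifier:__choice__, Value: 'gaussian_nb'
  feature_preprocessor:__choice__, Value: 'fast_ica'
  feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
  feature_preprocessor:fast_ica:fun, Value: 'exp'
  feature_preprocessor:fast_ica:whiten, Value: 'False'
"""
values = parse_configuration_dump(dump)
swapped = replace_choice(
    values,
    "feature_preprocessor",
    "kernel_pca",
    {"gamma": 6.985386846337043, "kernel": "rbf", "n_components": 10},
)
```

The sketch only manipulates a plain dict. Whether the result can be turned back into a valid `Configuration`, for example through `ConfigSpace.Configuration(configuration_space, values=swapped)`, is an untested assumption here and is the difficulty the linked ConfigSpace issue describes.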
