
Failing configurations #1337

Open
@eddiebergman

  • EDIT: After some investigation, this appears to have less to do with the configurations and more to do with imputation. After trying to recreate the failures, it seems that X arrays that reach a threshold of NaNs cause the configurations to fail. These NaNs are added randomly, which explains why the failures are infrequent.

  • Edit2: "fast_ica" with "fun": "exp" fails if the majority of the data is NaN.

  • Edit3: "fast_ica" with "fast_ica:whiten": "False" fails with NaNs in the input.

  • Edit4: "fast_ica" with "whiten": "False" fails even with no NaN values present.

  • Edit5: "fast_ica" with the "iris" dataset works, even with a high occurrence of NaNs. The failures seem to depend more on the frequency of 0s in the dataset than on NaNs.

  • Edit6: Trying to force a certain "feature_preprocessor:__choice__" is currently not possible. Manually editing the Config is not straightforward and should be approached once ConfigSpace.Configuration is updated to allow easier modification of a Config. See ConfigSpace #205 for why it is not straightforward to delete a key and add a new one.
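
Since the edits above point at the share of NaNs and 0s in X as the likely trigger, it may help to record both numbers whenever a configuration fails. The following is only a minimal sketch: `nan_and_zero_fractions` is a hypothetical helper, not part of auto-sklearn, and it assumes a dense 2-D input (a nested list, or a dense NumPy array iterated row by row). Sparse matrices from `test_configurations_sparse` would need to be densified first.

```python
import math


def nan_and_zero_fractions(X):
    """Return (fraction of NaN entries, fraction of exact-zero entries) in X.

    Hypothetical debugging helper: X is assumed to be a dense 2-D
    collection of floats (rows of values).
    """
    total = nans = zeros = 0
    for row in X:
        for value in row:
            total += 1
            if math.isnan(value):
                nans += 1
            elif value == 0:
                zeros += 1
    if total == 0:
        raise ValueError("X has no entries")
    return nans / total, zeros / total


# Tiny illustrative input, not taken from the failing tests.
X = [
    [0.0, 1.5, float("nan")],
    [0.0, float("nan"), 2.0],
]
nan_frac, zero_frac = nan_and_zero_fractions(X)
print(f"NaN fraction: {nan_frac:.2f}, zero fraction: {zero_frac:.2f}")
```

Logging these two fractions next to each failing configuration would make it easier to check whether the failures cluster above a particular NaN or zero threshold.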

We leave some randomness in the configurations that get tested when testing different classifier and regressor components. The failing configurations are collected here:


  • Python version 3.8
  • Test
  • test/test_pipeline/test_classification.py::SimpleClassificationPipelineTest::test_configurations_sparse
Configuration:
  balancing:strategy, Value: 'weighting'
  classifier:__choice__, Value: 'sgd'
  classifier:sgd:alpha, Value: 7.27693595714389e-05
  classifier:sgd:average, Value: 'False'
  classifier:sgd:eta0, Value: 0.013654826040547558
  classifier:sgd:fit_intercept, Constant: 'True'
  classifier:sgd:learning_rate, Value: 'invscaling'
  classifier:sgd:loss, Value: 'log'
  classifier:sgd:penalty, Value: 'l1'
  classifier:sgd:power_t, Value: 0.5468767593727824
  classifier:sgd:tol, Value: 8.162675288740052e-05
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'mean'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 467
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
  feature_preprocessor:__choice__, Value: 'kernel_pca'
  feature_preprocessor:kernel_pca:gamma, Value: 6.985386846337043
  feature_preprocessor:kernel_pca:kernel, Value: 'rbf'
  feature_preprocessor:kernel_pca:n_components, Value: 10

  • Python version 3.8
  • Test
  • test/test_pipeline/test_classification.py::SimpleClassificationPipelineTest::test_configurations_signed_data
Configuration:
  balancing:strategy, Value: 'weighting'
  classifier:__choice__, Value: 'lda'
  classifier:lda:shrinkage, Value: 'auto'
  classifier:lda:tol, Value: 0.038890093430048595
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'minority_coalescer'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:minority_coalescer:minimum_fraction, Value: 0.001521146558163954
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'mean'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'none'
  feature_preprocessor:__choice__, Value: 'fast_ica'
  feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
  feature_preprocessor:fast_ica:fun, Value: 'exp'
  feature_preprocessor:fast_ica:whiten, Value: 'False'

  • Python version 3.10
  • Test
  • SimpleClassificationPipelineTest.test_configurations_sparse
Configuration:
  balancing:strategy, Value: 'none'
  classifier:__choice__, Value: 'qda'
  classifier:qda:reg_param, Value: 0.7722372097734942
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'median'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 1761
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
  feature_preprocessor:__choice__, Value: 'kernel_pca'
  feature_preprocessor:kernel_pca:gamma, Value: 2.351280410584469
  feature_preprocessor:kernel_pca:kernel, Value: 'rbf'
  feature_preprocessor:kernel_pca:n_components, Value: 10

  • Python version 3.8
  • Test
  • SimpleClassificationPipelineTest.test_configurations_signed_data
Configuration:
  balancing:strategy, Value: 'none'
  classifier:__choice__, Value: 'gaussian_nb'
  data_preprocessor:__choice__, Value: 'feature_type'
  data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
  data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
  data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'most_frequent'
  data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 1004
  data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
  feature_preprocessor:__choice__, Value: 'fast_ica'
  feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
  feature_preprocessor:fast_ica:fun, Value: 'exp'
  feature_preprocessor:fast_ica:whiten, Value: 'False'
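
To re-run one of the configurations above, the printed dump can be turned back into a plain dictionary. This is a rough sketch under the assumption that every entry has the form `key, Value: <literal>` or `key, Constant: <literal>`, as in the dumps collected here. `parse_config_dump` is a hypothetical helper, not an auto-sklearn or ConfigSpace function.

```python
import ast


def parse_config_dump(text):
    """Parse lines like "key, Value: 'x'" or "key, Constant: 'x'" into a dict.

    Hypothetical helper: lines without one of the two markers
    (for example the "Configuration:" header) are skipped.
    """
    config = {}
    for line in text.splitlines():
        line = line.strip()
        for marker in (", Value: ", ", Constant: "):
            if marker in line:
                key, raw = line.split(marker, 1)
                # literal_eval turns "'lda'" into a str and "0.03" into a float.
                config[key] = ast.literal_eval(raw)
                break
    return config


# A few lines copied from the second configuration above.
dump = """
Configuration:
  classifier:__choice__, Value: 'lda'
  classifier:lda:shrinkage, Value: 'auto'
  classifier:lda:tol, Value: 0.038890093430048595
  feature_preprocessor:fast_ica:whiten, Value: 'False'
"""
config = parse_config_dump(dump)
for key, value in config.items():
    print(f"{key} = {value!r}")
```

Values such as `'False'` stay strings because that is how they appear in the dump. The resulting dictionary is only a starting point for rebuilding a Configuration against the pipeline's configuration space.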
