Ryan Szeto edited this page Feb 24, 2020 · 9 revisions

Welcome to the d3m_michigan_primitives wiki!

Performance

Last updated: 2/7/2020

This section keeps track of the expected performance of each pipeline, as well as how it ranked on the leaderboard.

  • Pipeline: Name of pipeline
  • Metric: The metric used for scoring on the dataset
  • Score: The score obtained by our pipeline
  • Baseline: The score obtained by the baseline pipeline
  • Commit: The commit used to obtain our score
| Pipeline | Metric | Score | Baseline | Commit |
|---|---|---|---|---|
| EKSSOneHundredPlantsMarginPipeline | NORMALIZED_MUTUAL_INFORMATION | 0.8277211761287837 | 0.816731 | d040be3fc7070669a081df53d9e5117a2349234d |
| GRASTAAutoMPGPipeline | MEAN_SQUARED_ERROR | 828.1976218564781 | 7.37077 | d040be3fc7070669a081df53d9e5117a2349234d |
| GRASTAAutoPricePipeline | MEAN_SQUARED_ERROR | 7846794.96320418 | 6985720 | d040be3fc7070669a081df53d9e5117a2349234d |
| KSSOneHundredPlantsMarginPipeline | NORMALIZED_MUTUAL_INFORMATION | 0.8050467420628115 | 0.816731 | d040be3fc7070669a081df53d9e5117a2349234d |
| OWLRegressionAutoPricePipeline | MEAN_SQUARED_ERROR | 5387819.182395744 | 6985720 | d040be3fc7070669a081df53d9e5117a2349234d |
| SSCADMMOneHundredPlantsMarginPipeline | NORMALIZED_MUTUAL_INFORMATION | 0.6651953038636657 | 0.816731 | d040be3fc7070669a081df53d9e5117a2349234d |
| SSCCVXOneHundredPlantsMarginPipeline | NORMALIZED_MUTUAL_INFORMATION | 0.7627863380916519 | 0.816731 | d040be3fc7070669a081df53d9e5117a2349234d |
| SSCOMPOneHundredPlantsMarginPipeline | NORMALIZED_MUTUAL_INFORMATION | 0.5807331371730341 | 0.816731 | d040be3fc7070669a081df53d9e5117a2349234d |
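
When reading the table, keep in mind that the two metrics point in opposite directions: a lower MEAN_SQUARED_ERROR is better, while a higher NORMALIZED_MUTUAL_INFORMATION is better. The sketch below is one possible way to check which pipelines beat their baseline. The helper `beats_baseline` and the `RESULTS` list are illustrative names, not part of the repository.

```python
# Metrics where a smaller value is better. NORMALIZED_MUTUAL_INFORMATION
# is treated as larger-is-better.
LOWER_IS_BETTER = {"MEAN_SQUARED_ERROR"}

# (pipeline, metric, score, baseline) rows copied from the table above.
RESULTS = [
    ("EKSSOneHundredPlantsMarginPipeline", "NORMALIZED_MUTUAL_INFORMATION", 0.8277211761287837, 0.816731),
    ("GRASTAAutoMPGPipeline", "MEAN_SQUARED_ERROR", 828.1976218564781, 7.37077),
    ("GRASTAAutoPricePipeline", "MEAN_SQUARED_ERROR", 7846794.96320418, 6985720),
    ("KSSOneHundredPlantsMarginPipeline", "NORMALIZED_MUTUAL_INFORMATION", 0.8050467420628115, 0.816731),
    ("OWLRegressionAutoPricePipeline", "MEAN_SQUARED_ERROR", 5387819.182395744, 6985720),
    ("SSCADMMOneHundredPlantsMarginPipeline", "NORMALIZED_MUTUAL_INFORMATION", 0.6651953038636657, 0.816731),
    ("SSCCVXOneHundredPlantsMarginPipeline", "NORMALIZED_MUTUAL_INFORMATION", 0.7627863380916519, 0.816731),
    ("SSCOMPOneHundredPlantsMarginPipeline", "NORMALIZED_MUTUAL_INFORMATION", 0.5807331371730341, 0.816731),
]


def beats_baseline(metric, score, baseline):
    """Return True if `score` is better than `baseline` for the given metric."""
    if metric in LOWER_IS_BETTER:
        return score < baseline
    return score > baseline


winners = [
    name
    for name, metric, score, baseline in RESULTS
    if beats_baseline(metric, score, baseline)
]
print(winners)
```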

Baseline methods

This section describes the baseline algorithm for each dataset of interest. The baselines were determined by reading the "solution" folder inside each dataset. For example, the baseline for 196_autoMpg_MIN_METADATA was found under /z/mid/D3M/datasets/seed_datasets_current/196_autoMpg_MIN_METADATA/196_autoMpg_solution.

196_autoMpg_MIN_METADATA

  1. Impute missing values, normalize numerical values, etc.
  2. Select features from lasso regression with scikit-learn
  3. Fit selected features with SGD linear regressor
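
The three steps above can be approximated with a scikit-learn pipeline. This is a minimal sketch, not the code from the solution folder: the synthetic data, the mean imputation strategy, the use of `SelectFromModel` to select on lasso coefficients, and the `alpha` value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Lasso, SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical); the real baseline reads the D3M dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
X[rng.random(X.shape) < 0.05] = np.nan  # simulate missing values

baseline = make_pipeline(
    SimpleImputer(strategy="mean"),      # step 1: impute missing values
    StandardScaler(),                    # step 1: normalize numerical values
    SelectFromModel(Lasso(alpha=0.1)),   # step 2: keep features with nonzero lasso weights
    SGDRegressor(random_state=0),        # step 3: SGD linear regressor
)
baseline.fit(X, y)
predictions = baseline.predict(X)

# Boolean mask of the features that survived the lasso-based selection.
selected = baseline.named_steps["selectfrommodel"].get_support()
print(selected)
```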

LL0_207_autoPrice_MIN_METADATA

  1. Impute missing values, normalize numerical values, etc.
  2. Select top-performing features using SelectPercentile from scikit-learn
  3. Fit selected features with SGD linear regressor
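
A rough sketch of this baseline, with `SelectPercentile` as the feature-selection step, might look as follows. The score function (`f_regression`), the percentile of 50, the mean imputation strategy, and the synthetic data are assumptions; the solution folder may use different settings.

```python
import numpy as np
from sklearn.feature_selection import SelectPercentile, f_regression
from sklearn.impute import SimpleImputer
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): only feature 0 is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 4.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

baseline = make_pipeline(
    SimpleImputer(strategy="mean"),                  # step 1: impute missing values
    StandardScaler(),                                # step 1: normalize numerical values
    SelectPercentile(f_regression, percentile=50),   # step 2: keep the top-scoring half
    SGDRegressor(random_state=0),                    # step 3: SGD linear regressor
)
baseline.fit(X, y)

# Boolean mask of the features kept by SelectPercentile.
kept = baseline.named_steps["selectpercentile"].get_support()
print(kept.sum(), "of", kept.size, "features kept")
```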

1491_one_hundred_plants_margin_clust

  1. Impute missing values, normalize numerical values, etc.
  2. Run K-means with 100 clusters using scikit-learn
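
As an illustration, the clustering baseline and its NORMALIZED_MUTUAL_INFORMATION score could be sketched as below. The synthetic blobs, the mean imputation strategy, and the `n_init` and `random_state` values are assumptions, not settings taken from the solution folder.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.impute import SimpleImputer
from sklearn.metrics import normalized_mutual_info_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): 100 well-separated groups of 5 points.
rng = np.random.default_rng(0)
n_clusters = 100
centers = rng.normal(scale=10.0, size=(n_clusters, 8))
true_labels = np.repeat(np.arange(n_clusters), 5)
X = centers[true_labels] + rng.normal(scale=0.1, size=(true_labels.size, 8))

model = make_pipeline(
    SimpleImputer(strategy="mean"),                             # step 1: impute missing values
    StandardScaler(),                                           # step 1: normalize numerical values
    KMeans(n_clusters=n_clusters, n_init=10, random_state=0),   # step 2: K-means with 100 clusters
)
predicted = model.fit_predict(X)

# NMI compares two labelings and ignores how the cluster IDs are numbered.
score = normalized_mutual_info_score(true_labels, predicted)
print(score)
```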

22_handgeometry_MIN_METADATA

  1. Subtract mean hand image
  2. Extract pool5/7x7_s1 activations from DeepHand DNN model via Caffe
  3. Fit activations with SVR

How to print from a primitive when running a pipeline

These instructions only work when used inside a primitive definition (example), NOT a pipeline definition (example).

  1. Add the following near the top of the script:

    import logging
    logger = logging.getLogger(__name__)
  2. Use the following to print something to the logger:

    logger.warning('hello world')
    • The logging level must be WARNING or higher (i.e., warning(), error(), or critical()) for the message to appear. Note that logger.warn() is a deprecated alias of logger.warning(). More information on these methods can be found in the documentation for Python's logging module.
    • Don't forget to cast arguments as strings, e.g., logger.warning(str(X)) if X is a numpy array.
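
Putting the two steps together, a minimal self-contained sketch could look like the following. `HypotheticalPrimitive` is a stand-in class, not a real d3m primitive, and the in-memory handler exists only so the example can show what gets emitted. The level is set explicitly to WARNING (which is also Python's default threshold) so the sketch behaves the same under any logging configuration.

```python
import io
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)  # same threshold as Python's default


class HypotheticalPrimitive:
    """Stand-in for a primitive class; not part of d3m."""

    def produce(self, inputs):
        logger.info("info messages are filtered out at the WARNING level")
        # %-style arguments are converted to strings by the logging module.
        logger.warning("producing %d rows: %s", len(inputs), inputs)
        return inputs


# Capture log output in memory for demonstration purposes only.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger.addHandler(handler)

HypotheticalPrimitive().produce([1, 2, 3])
output = stream.getvalue()
print(output)

logger.removeHandler(handler)
```

Passing values as %-style arguments, as in the `logger.warning` call above, is an alternative to calling `str()` yourself; the message is only formatted if it is actually emitted.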