@esantorella
Contributor

Summary:
Context:

default_data_constructor and default_data_type are used for a few purposes:

  1. Determining the type of empty data
  2. Determining the type of data that results from combining multiple Datas
  3. Validating that observations passed match the default_data_type on the experiment

Our data classes are now reduced to just Data and MapData, there is only one map key, and the two classes differ mainly in whether they have a "step" column, so there is little reason to track the intended type of data so carefully.

This PR brings us closer to unifying Data and MapData. With this change, a data object should be a MapData if and only if it has a "step" column, so the class carries no information that can't be obtained by checking whether there is a "step" column.

This PR:

  1. Makes empty data a Data
  2a. When combining multiple datas, makes the result a MapData if any of the constituent objects is a MapData
  2b. When making a Data from a DataFrame, makes it a MapData if there is a "step" column
  3. Removes some validations that are no longer necessary
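
The type rules in items 1, 2a, and 2b can be sketched as follows. This is a minimal illustration, not Ax's implementation: the Data and MapData classes below are hypothetical stand-ins that hold rows as a list of dicts instead of a DataFrame, and data_from_rows and combine are names invented for this example.

```python
from dataclasses import dataclass, field

STEP_COLUMN = "step"  # the column name the PR description refers to


@dataclass
class Data:
    """Hypothetical stand-in: holds rows as a list of dicts, not Ax's Data."""

    rows: list[dict] = field(default_factory=list)

    @property
    def columns(self) -> set[str]:
        # Union of the keys seen across all rows.
        return {key for row in self.rows for key in row}


@dataclass
class MapData(Data):
    """Hypothetical stand-in for data that carries a "step" column."""


def data_from_rows(rows: list[dict]) -> Data:
    """Items 1 and 2b: empty input gives Data; a "step" column gives MapData."""
    has_step = any(STEP_COLUMN in row for row in rows)
    return MapData(rows) if has_step else Data(rows)


def combine(datas: list[Data]) -> Data:
    """Item 2a: the result is a MapData if any constituent is a MapData."""
    rows = [row for d in datas for row in d.rows]
    cls = MapData if any(isinstance(d, MapData) for d in datas) else Data
    return cls(rows)
```

Under these assumptions, no caller has to say which class it expects: the class follows from the inputs alone.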

TODO for this PR:

  • Think more about backward compatibility and deprecation messages
  • Reverse the removal of Experiment's default_data_type argument
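
One possible shape for the backward-compatibility TODO is to keep accepting the argument while warning that it is ignored. The sketch below is an assumption about how that could look, not what this PR does: the Experiment class is a hypothetical stand-in and the warning text is invented.

```python
import warnings
from typing import Any, Optional


class Experiment:
    """Hypothetical stand-in, not Ax's Experiment class."""

    def __init__(self, name: str, default_data_type: Optional[Any] = None) -> None:
        self.name = name
        if default_data_type is not None:
            # The argument is still accepted so existing callers keep working,
            # but its value is ignored and never stored.
            warnings.warn(
                "`default_data_type` is deprecated and has no effect.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
```

Callers that never pass the argument see no warning, so the follow-up diff that stops accepting it would only affect code that has already been warned.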

Some TODOs for follow-up diffs:

  • Stop letting Experiment accept a default_data_type argument
  • Remove Metric.data_constructor (if needed, replacing it with a boolean attribute indicating whether a progression will be produced)
  • Convert some Metric methods such as _unwrap_experiment_data into static methods or move them off Metric entirely now that they do not reference the data_constructor attribute

Differential Revision: D89689313

@meta-cla meta-cla bot added the CLA Signed label Dec 22, 2025
@meta-codesync

meta-codesync bot commented Dec 22, 2025

@esantorella has exported this pull request. If you are a Meta employee, you can view the originating Diff in D89689313.

esantorella added a commit to esantorella/Ax that referenced this pull request Dec 23, 2025
…a_type (facebook#4691)

@codecov-commenter

codecov-commenter commented Dec 23, 2025

Codecov Report

❌ Patch coverage is 96.96970% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.69%. Comparing base (d1d20a4) to head (5eebbea).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ax/core/experiment.py | 85.71% | 1 Missing ⚠️ |
| ax/storage/sqa_store/decoder.py | 83.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4691      +/-   ##
==========================================
- Coverage   96.69%   96.69%   -0.01%     
==========================================
  Files         580      580              
  Lines       60714    60694      -20     
==========================================
- Hits        58708    58688      -20     
  Misses       2006     2006              


esantorella added a commit to esantorella/Ax that referenced this pull request Dec 23, 2025
…a_type (facebook#4691)

Summary:
Pull Request resolved: facebook#4691

**Context:**

`default_data_constructor` and `default_data_type` are used for a few purposes:
1. Determining the type of empty data
2. Determining the type of data that results from combining multiple `Data`s
3. Validating that observations passed match the `default_data_type` on the experiment

Our data classes are now reduced to just `Data` and `MapData`, there is only one map key, and the two classes differ mainly in whether they have a "step" column, so there is little reason to track the intended type of data so carefully.

This PR brings us closer to unifying `Data` and `MapData`. With this change, a data object should be a `MapData` if and only if it has a "step" column, so the class carries no information that can't be obtained by checking whether there is a "step" column.

**This PR:**
1. Makes empty data `Data`
2a. When combining multiple datas, the result is MapData if one of the constituent objects is a MapData.
2b. When making a Data from a DataFrame, it should be a MapData if there is a "step" column
3. Removes some validations that are no longer necessary

* Removes `Experiment._default_data_type`
* Removes `Experiment.default_data_type`
* Removes `Experiment.default_data_constructor`

**Some TODOs for follow-up diffs:**
* Deprecate or remove `default_data_type` argument to experiment
* Remove `Metric.data_constructor` (if needed, replacing it with a boolean attribute indicating whether a progression will be produced)
* Convert some `Metric` methods such as `_unwrap_experiment_data` into static methods or move them off `Metric` entirely now that they do not reference the `data_constructor` attribute

Reviewed By: lena-kashtelyan

Differential Revision: D89689313
esantorella added a commit to esantorella/Ax that referenced this pull request Dec 26, 2025
…a_type (facebook#4691)

Summary:
Pull Request resolved: facebook#4691

Differential Revision:
D89689313

Privacy Context Container: L1307644

Reviewed By: lena-kashtelyan
@meta-codesync meta-codesync bot closed this in 385ee61 Dec 26, 2025
@meta-codesync

meta-codesync bot commented Dec 26, 2025

This pull request has been merged in 385ee61.

Labels

CLA Signed, fb-exported, Merged, meta-exported

3 participants