Add new function modifier `upack_fields` #1303

cswartzvi · 2025-04-06T13:33:58Z

Added a new modifier called unpack_fields which allows for the extraction of fields from a tuple output (a cross between extract_columns and extract_fields).

Changes

Added unpack_fields decorator to hamilton/function_modifiers/__init__.py and implemented its logic in hamilton/function_modifiers/expanders.py. This decorator enables the extraction of fields from a tuple output, expanding a single function into multiple nodes.

How I tested this

Added multiple test cases in tests/function_modifiers/test_expanders.py to validate the functionality of the unpack_fields decorator, including tests for valid and invalid type annotations and different tuple configurations.

Notes

Example usage with an fixed size tuple:

  @unpack_fields("X_train" "X_validation", "X_test")
  def dataset_splits(X: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
      """Randomly split data into train, validation, test"""
      X_train, X_validation, X_test = random_split(X)
      return X_train, X_validation, X_test

Example usage with a subset of fixed length tuples:

  @unpack_fields("X_train")
  def dataset_splits(X: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
      """Randomly split data into train, validation, test"""
      X_train, X_validation, X_test = random_split(X)
      return X_train, X_validation, X_test

Example usage with an indeterminate length tuple:

  @unpack_fields("X_train" "X_validation", "X_test")
  def dataset_splits(X: np.ndarray) -> Tuple[np.ndarray, ...]:
      """Randomly split data into train, validation, test"""
      X_train, X_validation, X_test = random_split(X)
      return X_train, X_validation, X_test

Checklist

PR has an informative and human-readable title (this will be pulled into the release notes)
Changes are limited to a single goal (no scope creep)
Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
Any change in functionality is tested
New functions are documented (with a description, list of inputs, and expected output)
Placeholder code is flagged / future TODOs are captured in comments
Project documentation has been updated if adding/changing functionality.

ellipsis-dev

❌ Changes requested. Reviewed everything up to 46608a6 in 2 minutes and 21 seconds

More details

Looked at 351 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 8 drafted comments based on config settings.

1. hamilton/function_modifiers/expanders.py:952

Draft comment:
Consider using an explicit check (e.g. if field_types is None:) here instead of a generic falsy check to clarify that a valid empty list is not expected.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
Looking at _process_unpack_fields(), it explicitly returns None if the base_type is not a tuple. The function never returns an empty list. So while an explicit None check would be more precise, the current falsy check works correctly and is idiomatic Python. The suggestion doesn't fix a real issue.
The comment makes a valid point about being explicit. In some cases, using is None can prevent subtle bugs when falsy values are valid. However, in this specific case...
Looking at the implementation, None is the only falsy value that can be returned by _process_unpack_fields(). An empty list is never a valid return value, so the current check is appropriate.
The comment should be deleted. The current code is correct and clear enough - the function explicitly returns None for invalid cases and the falsy check is a standard Python idiom that works correctly here.

2. hamilton/function_modifiers/expanders.py:994

Draft comment:
The extractor functions (both async and sync) are defined inline. For clarity and maintainability, consider extracting them to a helper function to avoid closure-related pitfalls.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%
The comment raises a valid code quality concern - defining functions in loops with closures can be tricky and error-prone. The code is using default arguments (field_index=idx) to avoid closure issues, but extracting to helper functions could make this pattern more explicit and maintainable. However, the current implementation seems to be working as intended and the closure pitfall is already handled.
The comment may be overstating the issue since the code already handles closure problems via default arguments. Moving the functions could add complexity without clear benefit.
While the suggestion is technically valid, the current implementation is already handling closure issues appropriately and the code is reasonably clear as is.
The comment should be deleted since it suggests a refactor that isn't clearly beneficial given the current working implementation.

3. hamilton/function_modifiers/expanders.py:890

Draft comment:
When handling indeterminate length tuples (with ellipsis), the code creates a new tuple by repeating the first type. Ensure that this behavior is well-documented (e.g., in Sphinx docs) so users understand that unpacked types will be repeated based on the number of fields provided.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
The comment is suggesting adding documentation about an implementation detail. While the behavior is important, the comment is asking for documentation rather than pointing out a code issue that needs to be fixed. The rules state "Do NOT make purely informative comments" and "Do NOT comment unless there is clearly a code change required". Documentation suggestions should go through other channels.
The behavior being described is non-obvious and could surprise users - having it documented would help prevent confusion. The comment is trying to improve code clarity.
While documentation would be helpful, the review rules explicitly state not to make purely informative comments or documentation requests. These should be handled through other means like documentation tickets.
Delete the comment since it is purely requesting documentation and doesn't require any code changes, which violates the review rules.

4. tests/function_modifiers/test_expanders.py:528

Draft comment:
Typo: The test function name 'test_upack_fields_valid_indeterminate_tuple' should be 'test_unpack_fields_valid_indeterminate_tuple' for consistency.
Reason this comment was not posted:
Marked as duplicate.

5. docs/:0

Draft comment:
Please add Sphinx documentation for the new unpack_fields modifier, including usage examples, accepted return types (explicit and indeterminate tuples), and error cases.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

6. hamilton/function_modifiers/expanders.py:1108

Draft comment:
Typo: In the comment on line 1108, 'maeks' should be corrected to 'makes' for clarity.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

7. tests/function_modifiers/test_expanders.py:28

Draft comment:
Typographical error: The word 'non_existant' appears in the parameters. It should be spelled 'non_existent'.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

8. tests/function_modifiers/test_expanders.py:527

Draft comment:
Typographical inconsistency: The test function name 'test_upack_fields_valid_indeterminate_tuple' seems to contain a typo ('upack' instead of 'unpack'). Please rename it to 'test_unpack_fields_valid_indeterminate_tuple' for consistency with the rest of the code.
Reason this comment was not posted:
Marked as duplicate.

Workflow ID: wflow_cm8UzLgA8Mp8aXYO

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

hamilton/function_modifiers/expanders.py

tests/function_modifiers/test_expanders.py

hamilton/function_modifiers/__init__.py

cswartzvi · 2025-04-06T14:30:32Z

Geez - apparently I have a problem remembering to include backward compatible type hints 🙃. Regardless, this is ready to be looked at!

elijahbenizzy

This is awesome! Thank you! I'm really looking forward to shipping this. Some small comments.

docs/concepts/function-modifiers.rst

hamilton/function_modifiers/expanders.py

cswartzvi · 2025-04-08T03:12:21Z

Wooh boy, TIL that improper use of an ellipsis in typing.Tuple (ex: Tuple[..., int], Tuple[int, int, ...], etc.) is an error prior to Python 3.11, even though it's perfectly fine in tuple for python 3.9+ (ex: tuple[..., int], tuple[int, int, ...], etc.) 🤷🏻‍♂️🤷🏻‍♂️. I have greatly expanded tests of tuple vs typing.Tuple in both valid and invalid cases using a little trick with eval. Note that this was the only way I could come up with where earlier python versions didn't complain about indexing tuple. Please let me know if it's too ... crazy.

cswartzvi · 2025-04-08T03:24:40Z

Just wanted to mention one other thing ... in slack @zilto mentioned that

' .. with tuple[str, ...] the @unpack_fields should be authoritative for the number of expected tuple values ...'.

Currently, @unpack_fields is extracting the first $n$ fields of an $m$ length tuple with only a check for $0 \leq n < m$ in the local extractor function. I could make the (sync/async) tuple_generator check that the requested fields match the ouput, but I wanted to get everyone's take first.

Sorry @zilto, I didn't see your response until after I started pushing commits!

zilto · 2025-04-09T00:14:28Z

@cswartzvi Great job on the PR! It makes me very happy to see docs pages included 😄

Regarding tuple length, I see the merit of your approach and implementation. For example, in this snippet:

  @unpack_fields("X_train")
  def dataset_splits(X: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
      """Randomly split data into train, validation, test"""
      X_train, X_validation, X_test = random_split(X)
      return X_train, X_validation, X_test

It seems reasonable to allow @unpack_fields("X_train") to only unpack X_train, the first element of the tuple. This matches how @extract_fields can selectively extract keys of a dict. It avoids creating unnecessary nodes and the node dataset_splits allows to get the full tuple.

elijahbenizzy

Looks good!

ellipsis-dev bot reviewed Apr 6, 2025

View reviewed changes

hamilton/function_modifiers/expanders.py Outdated Show resolved Hide resolved

tests/function_modifiers/test_expanders.py Outdated Show resolved Hide resolved

hamilton/function_modifiers/__init__.py Show resolved Hide resolved

elijahbenizzy reviewed Apr 7, 2025

View reviewed changes

docs/concepts/function-modifiers.rst Show resolved Hide resolved

hamilton/function_modifiers/expanders.py Show resolved Hide resolved

hamilton/function_modifiers/expanders.py Outdated Show resolved Hide resolved

hamilton/function_modifiers/expanders.py Outdated Show resolved Hide resolved

cswartzvi added 13 commits April 7, 2025 20:17

Initial implementation of unpack_fields

eae55d4

Initial tests for unpack_fields

2f68edc

Add documentation for unpack_fields

83980bf

Use Optional for backward compatibility

769b20d

Fix typo in test name test_unpack_fields_valid_indeterminate_tuple

80d85f1

Fix docstring of _process_unpack_fields

7a651a5

Split unpack_field annotation tests for backward compatibility

0d4accf

Removed 'future' type annotations for backward compatibility

5777226

More backward compatibility changes

b99d900

Move error handling to _process_unpack_fields

71288f1

Remove redundant asynchronous extractor for unpack_fields

096c0a0

Expand _process_unpack_fields check for tuple[..., int]

b9b9c78

Expand unpack_fields annotation tests

9546cc8

cswartzvi force-pushed the add_upack_fields branch from 186d8dd to 9546cc8 Compare April 8, 2025 03:01

Fix some docstring typos

d35df6a

This was referenced May 9, 2025

Update extract fields #1305

Merged

Combined extract decorator #121

Open

elijahbenizzy approved these changes May 19, 2025

View reviewed changes

skrawcz approved these changes Jun 23, 2025

View reviewed changes

skrawcz merged commit ebfddc1 into apache:main Jun 23, 2025
24 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add new function modifier `upack_fields` #1303

Add new function modifier `upack_fields` #1303

Uh oh!

cswartzvi commented Apr 6, 2025

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cswartzvi commented Apr 6, 2025

Uh oh!

elijahbenizzy left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cswartzvi commented Apr 8, 2025 •

edited

Loading

Uh oh!

cswartzvi commented Apr 8, 2025

Uh oh!

zilto commented Apr 9, 2025

Uh oh!

elijahbenizzy left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add new function modifier upack_fields #1303

Add new function modifier upack_fields #1303

Uh oh!

Conversation

cswartzvi commented Apr 6, 2025

Changes

How I tested this

Notes

Checklist

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cswartzvi commented Apr 6, 2025

Uh oh!

elijahbenizzy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cswartzvi commented Apr 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cswartzvi commented Apr 8, 2025

Uh oh!

zilto commented Apr 9, 2025

Uh oh!

elijahbenizzy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add new function modifier `upack_fields` #1303

Add new function modifier `upack_fields` #1303

cswartzvi commented Apr 8, 2025 •

edited

Loading