Add per-timeframe feature processing support #178
Enable different feature processing functions for each timeframe,
allowing different feature engineering logic and output dimensions.
Changes:
- MarketDataObservationSampler: Accept list of callables for
per-timeframe processing (backward compatible with single callable)
- Add get_feature_keys_per_timeframe() and get_num_features_per_timeframe()
- Update _build_observation_specs() to handle per-TF feature counts
- Update SequentialTradingEnv to use per-TF feature counts
Usage:

```python
env = SequentialTradingEnv(
    df, config,
    feature_preprocessing_fn=[process_1min, process_5min],
)
```
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
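The two processing functions named in the usage example are not defined in the PR description. A plausible sketch, assuming each function receives that timeframe's DataFrame and must add columns prefixed with `features_` (the convention the sampler validation enforces); the column choices here are illustrative only:

```python
import pandas as pd

def process_1min(df: pd.DataFrame) -> pd.DataFrame:
    # Short-horizon features: simple returns and a fast rolling mean.
    # Produces two feature columns for this timeframe.
    df = df.copy()
    df["features_return"] = df["close"].pct_change().fillna(0.0)
    df["features_sma_fast"] = df["close"].rolling(5, min_periods=1).mean()
    return df

def process_5min(df: pd.DataFrame) -> pd.DataFrame:
    # Slower-horizon feature: rolling return volatility. Produces a single
    # feature column, giving this timeframe a different output dimension.
    df = df.copy()
    df["features_vol"] = (
        df["close"].pct_change().rolling(12, min_periods=1).std().fillna(0.0)
    )
    return df
```

Because the two functions emit different numbers of `features_*` columns, this pair exercises exactly the per-timeframe dimension support the PR adds.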
Address test coverage gaps identified by PR review:
- Add TestOneStepPerTimeframeFeatures class with tests for different feature dimensions per timeframe in OneStep environments
- Add TestSLTPPerTimeframeFeatures class with tests for different feature dimensions per timeframe in SLTP environments
- Add multi-step episode test to verify observation shapes remain consistent throughout an entire episode (not just reset and first step)

All 461 offline tests pass.

Refs: #177

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address high-priority gaps from test analyzer:
1. Add validation in sampler.py: raise ValueError if feature processing function produces no columns starting with 'features_'
2. Add validation in offline_base.py: raise ValueError if num_features dict is missing keys for any timeframe

Tests added:
- test_feature_processing_fn_with_no_features_raises
- test_feature_processing_fn_partial_failure_raises
- test_feature_processing_with_no_features_raises (env-level)

All 464 offline tests pass.

Refs: #177

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
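The two checks this commit describes can be sketched as standalone functions; the function names here are hypothetical (the real checks live inside sampler.py and offline_base.py), but the error messages follow the ones shown in the review below:

```python
from typing import Dict, List
import pandas as pd

def check_feature_columns(df: pd.DataFrame, tf_key: str,
                          features_start_with: str = "features_") -> List[str]:
    # Check 1 (sampler.py): the processing function for a timeframe must
    # produce at least one column with the expected prefix.
    cols = [c for c in df.columns if c.startswith(features_start_with)]
    if not cols:
        raise ValueError(
            f"Feature processing function for timeframe {tf_key} produced "
            f"no columns starting with '{features_start_with}'."
        )
    return cols

def check_num_features(num_features: Dict[str, int],
                       market_data_keys: List[str]) -> None:
    # Check 2 (offline_base.py): the per-timeframe feature-count dict must
    # have an entry for every market-data timeframe key.
    missing_keys = [k for k in market_data_keys if k not in num_features]
    if missing_keys:
        raise ValueError(
            f"num_features dict is missing keys for timeframes: {missing_keys}. "
            f"Expected keys: {market_data_keys}, got: {list(num_features.keys())}"
        )
```

Failing loudly at construction time, with the offending timeframe named, is what makes partial misconfiguration (one good function, one bad) easy to diagnose.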
Apply test-justification skill to eliminate duplicate coverage:

Deleted (8 tests):
- test_sequential.py: test_env_single_function_backward_compatible (covered by sampler test - env just passes through)
- test_sequential.py: test_feature_processing_with_no_features_raises (covered by sampler test - validation is in sampler)
- test_sequential_sltp.py: TestSLTPPerTimeframeFeatures class (3 tests) (SLTP inherits from base - no SLTP-specific code path for this feature)
- test_onestep.py: TestOneStepPerTimeframeFeatures class (3 tests) (OneStep inherits from base - no OneStep-specific code path)

Consolidated (2 tests → 1 parametrized):
- test_feature_processing_fn_with_no_features_raises
- test_feature_processing_fn_partial_failure_raises
→ Single parametrized test with [single_tf, multi_tf_partial] cases

Result: 456 tests (down from 464), all passing. Test reduction: 47% fewer per-timeframe feature tests with same coverage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Merge test_env_with_different_feature_dimensions and test_multi_step_episode_maintains_shapes into a single test. The multi-step test is a superset: it verifies spec shapes, reset shapes, AND shape consistency over 30 steps.

Result: 455 tests (down from 456), all passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude review
Claude finished @BY571's task in 1m 52s

Code Review
Review Summary

This PR successfully implements per-timeframe feature processing support while maintaining full backward compatibility. The implementation is well-structured, thoroughly tested, and includes excellent error handling with informative messages.

Strengths

1. Clean API Design ✅
2. Excellent Error Handling ✅

The validation is comprehensive and provides actionable error messages:

```python
# sampler.py:100-105
if len(cols_to_keep) == 0:
    raise ValueError(
        f"Feature processing function for timeframe {tf.obs_key_freq()} "
        f"produced no columns starting with '{features_start_with}'. "
        f"Ensure your function adds columns like 'features_close', 'features_volume', etc."
    )
```

```python
# offline_base.py:157-161
if missing_keys:
    raise ValueError(
        f"num_features dict is missing keys for timeframes: {missing_keys}. "
        f"Expected keys: {market_data_keys}, got: {list(num_features.keys())}"
    )
```

These error messages include:
3. Comprehensive Testing ✅
4. Thoughtful Backward Compatibility ✅

```python
# sampler.py:368-376
def get_feature_keys(self) -> List[str]:
    """Get feature columns (raises error if timeframes have different features).

    For backward compatibility with single-function feature processing.
    If timeframes have different features, use get_feature_keys_per_timeframe() instead.
    """
    if not all(c == columns[0] for c in columns):
        raise ValueError(
            "Timeframes have different features. Use get_feature_keys_per_timeframe() instead. "
            f"Feature counts per timeframe: {self.get_num_features_per_timeframe()}"
        )
```

This is better than the old
5. Clean Implementation ✅
Minor Observations

1. Type Annotation Consistency

In one branch the annotation is explicit:

```python
processing_fns: List[Optional[Callable]] = [None] * len(time_frames)
```

But in lines 57 and 60, the type annotation is omitted:

```python
processing_fns = [feature_processing_fn] * len(time_frames)
processing_fns = list(feature_processing_fn)
```

Suggestion: For consistency, you could add type annotations to all branches, but this is a very minor point and the code is clear as-is.

2. Documentation Clarity

The new methods have excellent docstrings:
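The three branches quoted in the observation above amount to a normalization step: `None`, a single callable broadcast to all timeframes, or an explicit list. A self-contained sketch (the function name and the length check are assumptions, not the PR's code):

```python
from typing import Callable, List, Optional, Sequence, Union

def normalize_processing_fns(
    feature_processing_fn: Union[Callable, Sequence[Callable], None],
    time_frames: Sequence[str],
) -> List[Optional[Callable]]:
    # Normalize the user-facing argument into one callable (or None)
    # per timeframe, preserving backward compatibility.
    if feature_processing_fn is None:
        processing_fns: List[Optional[Callable]] = [None] * len(time_frames)
    elif callable(feature_processing_fn):
        # Single function: apply the same processing to every timeframe.
        processing_fns = [feature_processing_fn] * len(time_frames)
    else:
        # Explicit list: must line up one-to-one with the timeframes.
        processing_fns = list(feature_processing_fn)
        if len(processing_fns) != len(time_frames):
            raise ValueError(
                f"Expected {len(time_frames)} processing functions, "
                f"got {len(processing_fns)}"
            )
    return processing_fns
```

Normalizing once at the boundary means downstream code can always iterate `zip(time_frames, processing_fns)` without re-checking which form the caller used.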
All docstrings follow the existing style.

3. Test Coverage

The test coverage is excellent:
Code Quality Metrics
Specific File Reviews
Summary
[process_1min, process_5min])

Closes #177
Changes
Core Implementation:
- sampler.py: Accept list of processing functions, add get_num_features_per_timeframe() and get_feature_keys_per_timeframe() methods
- offline_base.py: Update _build_observation_specs() to handle per-timeframe feature counts with validation
- sequential.py: Use per-timeframe feature counts for spec building

Validation:
- Raise ValueError if feature processing function produces no features_* columns
- Raise ValueError if num_features dict is missing timeframe keys

Test plan
🤖 Generated with Claude Code