
Conversation

@satwiksps
Contributor

Support scalar time-series and simplify config handling in TransformerEmbedding

This PR improves the usability of TransformerEmbedding by adding native support for scalar (1D) time-series inputs and removing the requirement to manually specify a full configuration dictionary.

Problem

TransformerEmbedding currently only accepts multivariate inputs of shape (batch, seq_len, D) and requires the user to assemble a full configuration dictionary by hand, which makes it harder to use for plain scalar time-series than the other embedding nets.

What this PR does

  • Adds automatic projection from scalar inputs to the required feature_space_dim via a lazily-initialized input_proj layer.
  • Extends the forward pass to handle the following input shapes (see the usage sketch after this list):
    • (batch, seq_len)
    • (batch, seq_len, 1)
    • (batch, seq_len, D) (existing behavior)
  • Ensures compatibility with the attention mechanism (e.g., head dimension constraints).
  • Adds new test test_transformer_embedding_scalar_timeseries to verify:
    • Correct handling of 1D inputs
    • Proper projection behavior
    • Successful integration into embedding API
  • Updates docstrings to reflect new behavior.
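
A minimal usage sketch of the new behavior, assuming TransformerEmbedding can be imported from sbi.neural_nets.embedding_nets and constructed with default settings (the exact constructor signature is still being discussed in this PR, so treat the arguments as placeholders):

    import torch
    from sbi.neural_nets.embedding_nets import TransformerEmbedding  # assumed import path

    embedding = TransformerEmbedding()  # defaults assumed; actual arguments may differ

    x_scalar = torch.randn(32, 100)      # (batch, seq_len): projected to feature_space_dim via the lazily-initialized input_proj
    x_channel = torch.randn(32, 100, 1)  # (batch, seq_len, 1): treated like the scalar case
    x_multi = torch.randn(32, 100, 4)    # (batch, seq_len, D): existing behavior, passed through unchanged

    for x in (x_scalar, x_channel, x_multi):
        print(embedding(x).shape)  # one embedding vector per batch element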

Why this is valuable

  • Makes the transformer embedding consistent with other embedding nets (RNN, CNN, LRU) that already accept scalar time-series.
  • Eliminates user friction by making TransformerEmbedding usable out of the box.
  • Enables simplified tutorials, workflows, and example code for time-series inference.

Testing

  • New test added in embedding_net_test.py.
  • All transformer-specific tests pass.
  • Pre-commit (ruff, formatting, linting) passed successfully.

Checklist

  • Code changes follow project style.
  • Added targeted tests.
  • Updated docstrings.
  • Pre-commit hooks passed.
  • No breaking changes introduced.

Closes #1696

@codecov

codecov bot commented Nov 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@2c216c2). Learn more about missing BASE report.
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1703   +/-   ##
=======================================
  Coverage        ?   84.67%           
=======================================
  Files           ?      137           
  Lines           ?    11493           
  Branches        ?        0           
=======================================
  Hits            ?     9732           
  Misses          ?     1761           
  Partials        ?        0           
Flag Coverage Δ
unittests 84.67% <100.00%> (?)

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
sbi/neural_nets/embedding_nets/transformer.py 93.40% <100.00%> (ø)

@janfb (Contributor) left a comment

Thanks @satwiksps, good first draft!

I suggest making all the config options actual arguments to __init__ to give the user full control and to have explicit type hints and defaults.

We could even think about using a dataclass TransformerConfig, but this would create additional overhead for the user, who would have to instantiate the config object externally; not sure. What's your take here?
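
To illustrate the two options being discussed, here is a rough sketch; all parameter names, types, and defaults below are hypothetical placeholders, not the actual API:

    from dataclasses import dataclass

    import torch.nn as nn

    # Option A: explicit keyword arguments with type hints and defaults (sketch only).
    class TransformerEmbedding(nn.Module):
        def __init__(
            self,
            feature_space_dim: int = 64,   # hypothetical default
            num_heads: int = 4,            # hypothetical default
            num_layers: int = 2,           # hypothetical default
            pos_emb: str = "sinusoidal",   # hypothetical default
            is_causal: bool = False,
        ) -> None:
            super().__init__()
            ...

    # Option B: a dataclass config that the user instantiates and passes in (sketch only).
    @dataclass
    class TransformerConfig:
        feature_space_dim: int = 64
        num_heads: int = 4
        num_layers: int = 2
        pos_emb: str = "sinusoidal"
        is_causal: bool = False

    # Under Option B the user would write: TransformerEmbedding(TransformerConfig(num_heads=8))

Option A keeps the call site simple and gives explicit signatures; Option B groups related settings but adds one more object for the user to create.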

@janfb (Contributor) left a comment

Looks good!

Added more suggestions for making it look even better in terms of documentation. Would be great if you could add these as well 🙏

@janfb commented on this excerpt:

    super().__init__()
    """
    Main class for constructing a transformer embedding
    Basic configuration parameters:

Suggested change:
-   Basic configuration parameters:
+   Args:

@janfb commented on this diff excerpt:

    Main class for constructing a transformer embedding
    Basic configuration parameters:
-   pos_emb (string): position encoding to be used, currently available:
+   pos_emb: position encoding to be used, currently available:

If arg docstrings break the line, please add a small indent to set them apart from other arguments.
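
For example, the continuation line could be indented one extra level so it reads as part of the same argument (illustrative only; the option list is elided):

    Args:
        pos_emb: position encoding to be used, currently available:
            ...
        is_causal: specifies whether causal mask should be created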

@janfb commented on this diff excerpt:

-   is_causal (bool): specifies whether causal mask should be created
-   vit (bool): specifies the whether a convolutional layer should be used for
+   is_causal: specifies whether causal mask should be created
+   vit: specifies the whether a convolutional layer should be used for

Please add an indent in the line below.

@janfb commented on this diff excerpt:

-   ffn (string): feedforward layer after used after computing the attention:
    pos_emb_base: base used to construct the positinal encoding
    rms_norm_eps: noise added to the rms variance computation
+   ffn: feedforward layer after used after computing the attention:

Please add an indent in the line below.

@janfb commented on this diff excerpt:

    ffn: feedforward layer after used after computing the attention:
    {"mlp", "moe"}
-   mlp_activation (string): activation function to be used within the ffn
+   mlp_activation: activation function to be used within the ffn

Please add an indent in the line below.

@janfb commented on this excerpt:

    return embeddings


    class TransformerEmbedding(nn.Module):

Would you be up for adding a short explanatory class docstring here?

E.g., for an SBI user working with time series or images but not so familiar with transformers, give a concise overview of how they can use this class: what "vit" means (for images), what "is_causal" means (for time series), etc. Not a tutorial, just a brief high-level explanation, maybe even with a short code Example block.

When we add this docstring at the top class level, it will show up nicely in the Sphinx documentation, e.g., like the EnsemblePosterior here: https://sbi.readthedocs.io/en/latest/reference/_autosummary/sbi.inference.EnsemblePosterior.html#sbi.inference.EnsemblePosterior
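
A rough sketch of what such a class-level docstring could look like (wording and arguments are illustrative, not final):

    class TransformerEmbedding(nn.Module):
        """Transformer-based embedding net for time-series and image data.

        Set ``is_causal=True`` for time series, so that each position only
        attends to earlier time steps; set ``vit=True`` to use a convolutional
        (ViT-style) patch embedding for image inputs.

        Example:
            >>> import torch
            >>> embedding = TransformerEmbedding()  # arguments illustrative
            >>> x = torch.randn(32, 100)            # scalar time series
            >>> emb = embedding(x)
        """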

@satwiksps force-pushed the feat/transformerembedding-1d-support branch 2 times, most recently from 20f7d47 to 8da9ec0 on November 20, 2025 at 20:18.
@satwiksps force-pushed the feat/transformerembedding-1d-support branch from 8da9ec0 to 4fb79ea on November 20, 2025 at 20:24.


Development

Successfully merging this pull request may close these issues.

Support scalar time-series and simplify config handling in TransformerEmbedding

2 participants