Comparing changes

base repository: TransformerLensOrg/TransformerLens
base: main
head repository: TransformerLensOrg/TransformerLens
compare: dev-4.x
  • 12 commits
  • 154 files changed
  • 7 contributors

Commits on May 11, 2026

  1. fix: Bump CI actions and stabilize flaky notebook checks (#1290)

    * ci: Bump all GitHub Actions to latest stable versions
    
    checkout@v3 -> v4
    cache@v3 -> v4
    cache/restore@v3 -> v4
    upload-artifact@v4 -> v7
    download-artifact@v4 -> v8
    setup-uv@v6 -> v8
    
    * ci: Add workflow_dispatch trigger to checks.yml
    
    * fix: Use setup-uv@v7 (v8 major tag not published)
    
    * fix: Use setup-uv@v7 in release.yml too
    
    * fix: Truncate floats to 3 decimal places in notebook comparisons
    
    * fix: Scale-aware float truncation for notebook comparisons
    
    * fix: Significant figures (3) for small-number notebook comparisons
    huseyincavusbi authored May 11, 2026
    f3fc9a8
  2. 0e9dced
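
The float handling described in #1290 (three decimal places, then scale-aware, then three significant figures) can be illustrated with a minimal sketch. This is not the code from the commit: `round_sig` and `same_to_sig` are hypothetical helper names, and the sketch rounds rather than truncates.

```python
import math


def round_sig(x: float, sig: int = 3) -> float:
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0 or not math.isfinite(x):
        return x  # zero, inf and nan have no meaningful leading digit
    # Position of the leading digit, e.g. -4 for 0.000123
    exponent = math.floor(math.log10(abs(x)))
    return round(x, sig - 1 - exponent)


def same_to_sig(a: float, b: float, sig: int = 3) -> bool:
    """Compare two notebook output values after rounding both."""
    return round_sig(a, sig) == round_sig(b, sig)
```

A fixed three-decimal cutoff maps both 0.00012 and 0.00049 to 0.000, which is presumably why the later commits moved to a scale-aware rule. Rounding-based comparison can still disagree for values that sit right on a rounding boundary.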

Commits on May 12, 2026

  1. Issue resolution for #341, #644, and #210 (#1300)

    * Resolution for #644 and #341
    
    * Started activation cache improvement
    
    * Full resolution for 210 + a demo notebook
    jlarson4 authored May 12, 2026
    bb15019
  2. Resolution for #796, #453, #385, and #297 (#1301)

    * Resolution for #644 and #341
    
    * Started activation cache improvement
    
    * Full resolution for 210 + a demo notebook
    
    * Resolution for #796, Factored Matrix memory leak
    
    * Resolved #453, underlying issue
    
    * Resolution for Issue #385, added notes about forced eager, added a test to check for future drift
    
    * Added hook introspection mixin for #297
    
    * Made improvements to booting training revisions
    jlarson4 authored May 12, 2026
    abf539d
  3. Improve Architecture Adapter Testing (#1303)

    * Resolution for #644 and #341
    
    * Started activation cache improvement
    
    * Full resolution for 210 + a demo notebook
    
    * Resolution for #796, Factored Matrix memory leak
    
    * Resolved #453, underlying issue
    
    * Resolution for Issue #385, added notes about forced eager, added a test to check for future drift
    
    * Added hook introspection mixin for #297
    
    * Made improvements to booting training revisions
    
    * Adapter test improvements
    
    * format cleanup
    jlarson4 authored May 12, 2026
    3bcf3d3

Commits on May 13, 2026

  1. Resolution for #112 and #830 (#1304)

    * Resolution for #644 and #341
    
    * Started activation cache improvement
    
    * Full resolution for 210 + a demo notebook
    
    * Resolution for #796, Factored Matrix memory leak
    
    * Resolved #453, underlying issue
    
    * Resolution for Issue #385, added notes about forced eager, added a test to check for future drift
    
    * Added hook introspection mixin for #297
    
    * Made improvements to booting training revisions
    
    * Adapter test improvements
    
    * format cleanup
    
    * Adding a way to display logit vector for #112 and add type hinting for #830
    
    * Type issue resolution – Resolving import confusion between TransformerBridgeConfig module and class, which were separate entities sharing the same name. Files renamed to properly differentiate
    
    * Format and type fixes
    
    * Removed unnecessary assertion
    jlarson4 authored May 13, 2026
    287a542

Commits on May 15, 2026

  1. Add OPT architecture adapter tests (#1305)

    * Fix type of HookedTransformerConfig.device (#1230)
    
    * Fix type of HookedTransformerConfig.device
    
    This is typed as `Optional[str]` but sometimes returns `torch.device`.
    Updated the code to just return the `str` instead of wrapping with a
    device.
    
    I'm not confident that every function which takes a device will
    always be passed a string, so I didn't change functions like
    warn_if_mps.
    
    Found while working on #1219
    
    * more cleanup
    
    * 3.0 CI Bugs (#1261)
    
    * Fixing `utils` imports
    
    * skip gated notebooks on PR from forks
    
    * Updating notebooks
    
    * Ensure LLaMA only runs when HF_TOKEN is available
    
    ---------
    
    Co-authored-by: jlarson4 <jonahalarson@comcast.net>
    
    * test: add OPT architecture adapter coverage
    
    * chore: rerun CI
    
    ---------
    
    Co-authored-by: Brendan Long <self@brendanlong.com>
    Co-authored-by: jlarson4 <jonahalarson@comcast.net>
    3 people authored May 15, 2026
    9a7ebf7
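
The `HookedTransformerConfig.device` fix carried in #1305 is about keeping a value typed `Optional[str]` from leaking out as a `torch.device`. A minimal sketch of that kind of normalization follows. `device_to_str` is a hypothetical helper, not a TransformerLens function, and it relies only on the fact that `str(torch.device("cuda:0"))` yields `"cuda:0"`.

```python
from typing import Any, Optional


def device_to_str(device: Any) -> Optional[str]:
    """Normalize a device to a plain string (hypothetical helper).

    Accepts a string, a torch.device-like object, or None, so callers
    that annotate the result as Optional[str] get what they expect.
    """
    if device is None:
        return None
    # str() leaves a plain string unchanged and turns a torch.device
    # into its "type" or "type:index" form.
    return str(device)
```

Functions that may still receive either form, such as the `warn_if_mps` case mentioned in the commit message, could call a helper like this at their boundary instead of assuming a string.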

Commits on May 18, 2026

  1. Add GPT2 and Gpt2LM Head architecture adapter tests (#1306)

    * add Gpt2 Model Bridge tests
    
    * removing unused params
    
    * adding missing bos and attn checks
    
    * updating FactoryRegistration test
    
    * formatting via black
    
    * adding custom head GPT2 tests
    sunny1401 authored May 18, 2026
    8e49aac

Commits on May 19, 2026

  1. Qwen3.5 text-only TransformerBridge support (#1313)

    * Fix type of HookedTransformerConfig.device (#1230)
    
    * Fix type of HookedTransformerConfig.device
    
    This is typed as `Optional[str]` but sometimes returns `torch.device`.
    Updated the code to just return the `str` instead of wrapping with a
    device.
    
    I'm not confident that every function which takes a device will
    always be passed a string, so I didn't change functions like
    warn_if_mps.
    
    Found while working on #1219
    
    * more cleanup
    
    * 3.0 CI Bugs (#1261)
    
    * Fixing `utils` imports
    
    * skip gated notebooks on PR from forks
    
    * Updating notebooks
    
    * Ensure LLaMA only runs when HF_TOKEN is available
    
    ---------
    
    Co-authored-by: jlarson4 <jonahalarson@comcast.net>
    
    * Add Qwen3.5 support and improve adapter validation
    
    - Document Qwen3.5 text-only model usage in special_cases.md
    - Update pyproject.toml to include transformers dependency for Qwen3.5
    - Enhance unit tests for Qwen3.5 architecture detection and dependency handling
    - Modify transformers.py to use prepared model config
    - Implement stricter validation in Qwen3_5ArchitectureAdapter for model compatibility
    
    * update lock file
    
    * Declare packaging for Qwen3.5 extra
    
    * Fix Qwen3.5 format and Mamba cache typing
    
    * Fix bridge component compatibility checks
    
    ---------
    
    Co-authored-by: Brendan Long <self@brendanlong.com>
    Co-authored-by: jlarson4 <jonahalarson@comcast.net>
    3 people authored May 19, 2026
    e359d7c
  2. 5fc97cc

Commits on May 20, 2026

  1. Feat/external architecture registration (#1307)

    * feat: External architecture adapter registration and entry-point discovery
    
    * docs: External adapter registration guide with examples
    
    * test: ArchitectureAdapterFactory registration, selection, and entry-point tests
    
    * fix: Clarify matching requirement and add warnings for failed discovery
    
    * fix: Make register_adapter doctest self-contained with inline adapter class
    
    * fix: Address Copilot review - alias, loop isolation, test state leak
    
    * fix: Apply black formatting to factory file
    
    * fix: use select_architecture_adapter in register_adapter doctest
    
    * fix: guard native adapter override in discover_entry_points
    
    * test: drop TestSupportedArchitectures
    
    * test: add entry-point discovery tests
    
    * docs: document native adapter override prevention
    
    * fix: apply black formatting to test file
    
    * fix: handle None ep.dist in entry-point guard warning
    
    * fix: apply black formatting to factory file
    huseyincavusbi authored May 20, 2026
    a152ccc
  2. Transformers v5 Gemma scaling adjustment (#1315)

    * Fixed double scaling of gemma, bump transformers pin to a 5.x version
    
    * testing and formatting
    jlarson4 authored May 20, 2026
    8e8d9d4
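
The registration flow listed under #1307 (external registration, entry-point discovery, a guard against overriding native adapters, warnings on failed discovery, and handling a `None` `ep.dist`) can be sketched generically. Everything named here is an assumption for illustration: `AdapterRegistry`, its methods, and the entry-point group are hypothetical and are not the `ArchitectureAdapterFactory` API. Only `importlib.metadata.entry_points` is a real standard-library call, and its `group=` keyword needs Python 3.10 or later.

```python
import warnings
from importlib.metadata import entry_points
from typing import Dict, Iterable, Optional

ENTRY_POINT_GROUP = "example.architecture_adapters"  # assumed group name


class AdapterRegistry:
    """Hypothetical registry that keeps native and external adapters apart."""

    def __init__(self, native: Dict[str, type]) -> None:
        self._native = dict(native)
        self._external: Dict[str, type] = {}

    def register(self, architecture: str, adapter_cls: type) -> bool:
        """Register an external adapter; refuse to shadow a native one."""
        if architecture in self._native:
            warnings.warn(f"Ignoring external adapter for native {architecture!r}")
            return False
        self._external[architecture] = adapter_cls
        return True

    def select(self, architecture: str) -> type:
        """Return the adapter class, preferring the native one."""
        if architecture in self._native:
            return self._native[architecture]
        return self._external[architecture]

    def discover(self, eps: Optional[Iterable] = None) -> None:
        """Load adapters advertised by installed packages."""
        if eps is None:
            eps = entry_points(group=ENTRY_POINT_GROUP)
        for ep in eps:
            try:
                adapter_cls = ep.load()
            except Exception as exc:  # one broken plugin must not stop the loop
                dist = getattr(ep, "dist", None)  # may be None
                source = dist.name if dist is not None else "unknown distribution"
                warnings.warn(f"Could not load {ep.name!r} from {source}: {exc}")
                continue
            self.register(ep.name, adapter_cls)
```

Passing the entry points in as an argument keeps discovery testable without installing a package, which matches the spirit of the entry-point discovery tests mentioned in the commit list.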