[Proposal] Add Nemotron-H hybrid Mamba2-Transformer adapter #1402

### Proposal

Add a TransformerBridge adapter for `NemotronHForCausalLM` (NVIDIA Nemotron-H), a hybrid that mixes Mamba-2 and attention layers.

### Motivation

Nemotron-H keeps only a small fraction of attention layers (around 8%) and replaces the rest with Mamba-2. That makes the few attention layers a clean target for interpretability: researchers can ask what those layers do that the state-space layers cannot. NVIDIA continues to release models in this line, adoption is wide, and an adapter would complement the existing Mamba and Mamba2 support.

Our support for Mamba layers is currently limited, so this work would also open new avenues for further work on those layers.

Gap scan (2026-06-18): 53 models and ~4.99M downloads, making this the highest-ranked hybrid state-space gap.

### Pitch

Build on the existing Mamba2 components for the state-space layers and standard attention hooks for the interleaved attention layers. A tiny test checkpoint (`trl-internal-testing/tiny-NemotronHForCausalLM-nano`) keeps CI cheap.
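As a rough illustration of the split described above, the adapter needs to know which block indices get Mamba2 components and which get attention hooks. The sketch below assumes the checkpoint config exposes a per-block pattern string and that the symbols `M`, `*`, and `-` stand for Mamba-2, attention, and MLP blocks; both the mapping and the helper name `group_layers_by_kind` are illustrative assumptions, not existing TransformerLens API, and should be checked against the real config.

```python
from collections import defaultdict

# Assumed symbol-to-layer-kind mapping; confirm against the checkpoint's config.
LAYER_KINDS = {"M": "mamba2", "*": "attention", "-": "mlp"}


def group_layers_by_kind(pattern: str) -> dict[str, list[int]]:
    """Map each layer kind to the block indices where it appears (hypothetical helper)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for index, symbol in enumerate(pattern):
        if symbol not in LAYER_KINDS:
            raise ValueError(f"unknown layer symbol {symbol!r} at index {index}")
        groups[LAYER_KINDS[symbol]].append(index)
    return dict(groups)


# Hypothetical 12-block pattern for illustration, not taken from a real checkpoint.
example_pattern = "M-M-M*-M-M-M"
layers = group_layers_by_kind(example_pattern)

# Fraction of blocks that are attention layers in this toy pattern.
attention_share = len(layers["attention"]) / len(example_pattern)
```

With a grouping like this, the state-space indices could be routed to the existing Mamba2 components and the attention indices to the standard attention hooks, and unknown symbols fail loudly instead of being silently skipped.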

- Claude Code users can scaffold with `/add-model-support nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16`.
- Register at the four sites listed in [contributing.md](../docs/source/content/contributing.md).
- Verify smallest-first: `trl-internal-testing/tiny-NemotronHForCausalLM-nano`, then `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16`.

### Additional context

- Paper: https://arxiv.org/abs/2504.03624
- Checkpoint: https://huggingface.co/nvidia/Nemotron-H-4B-Base-8K
- Found via the `hf_scraper` architecture-gaps pass (2026-06-18).

### Checklist

- [x] I have checked that there is no similar [issue](https://github.com/TransformerLensOrg/TransformerLens/issues) in the repo (**required**)

