
[WIP] Enable per-token grad collection#134

Open
luciaquirke wants to merge 7 commits into main from tokens

Conversation

@luciaquirke
Collaborator

@luciaquirke luciaquirke commented Feb 2, 2026

  • gradient_collectors update is AI-generated and needs review
  • data.py has too much content and needs decomposition
  • TODO enable in-memory per-token gradient collection
    • See if we can have less code replication in the gradient collectors, document findings for follow up work
  • otherwise ready for testing

@claude

claude bot commented Feb 2, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.


@luciaquirke luciaquirke force-pushed the tokens branch 3 times, most recently from 226ff18 to 07d8441 on February 3, 2026 08:12
@LouisYRYJ
Contributor

Looking forward to token filtering and think this is pretty important - have use cases for it!
Would it be possible to split this into multiple PRs to distinguish the new token features from general restructuring?

@luciaquirke
Collaborator Author

> Looking forward to token filtering and think this is pretty important - have use cases for it! Would it be possible to split this into multiple PRs to distinguish the new token features from general restructuring?

Yep will do

@luciaquirke luciaquirke force-pushed the tokens branch 8 times, most recently from 1d64cd9 to e4011d0 on February 9, 2026 07:26
…g, clean up configs

- Rewrite auto_batch_size to use simpler halving strategy instead of complex binary search
- Simplify allocate_batches by removing _allocate_batches_world indirection
- Remove unused hessian-related config fields and collector hooks
- Add load_from_disk support for Arrow datasets in data pipeline
- Clean up imports, logging, docs, and CLI help text
- Remove deprecated math utilities and unused score_writer functionality

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
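
For readers unfamiliar with the "simpler halving strategy" named in the commit message above, here is a minimal sketch of the general idea. It is not the code from this PR: `find_batch_size`, `try_batch`, `OutOfMemory`, and `fake_step` are hypothetical names chosen for illustration, and the memory limit is simulated rather than measured on a real device.

```python
from typing import Callable


class OutOfMemory(RuntimeError):
    """Hypothetical stand-in for an accelerator out-of-memory error."""


def find_batch_size(
    try_batch: Callable[[int], None], start: int, minimum: int = 1
) -> int:
    """Return the first batch size that try_batch accepts, halving on failure."""
    batch_size = start
    while batch_size >= minimum:
        try:
            try_batch(batch_size)  # run one trial step at this size
            return batch_size
        except OutOfMemory:
            batch_size //= 2  # halve and retry
    raise RuntimeError("no batch size >= minimum fits in memory")


def fake_step(batch_size: int) -> None:
    # Simulated device: pretend at most 24 examples fit in one batch.
    if batch_size > 24:
        raise OutOfMemory(f"batch size {batch_size} does not fit")
```

With the simulated limit, `find_batch_size(fake_step, start=128)` tries 128, 64, and 32 before settling on 16. Halving can undershoot the largest size that fits (24 here), which is the trade-off for having far less code than a binary search.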
@luciaquirke luciaquirke force-pushed the tokens branch 6 times, most recently from b562ada to 641b641 on February 9, 2026 10:26
@luciaquirke
Collaborator Author

@LouisYRYJ cleaned up

@LouisYRYJ
Contributor

sweet, will take a look today

