CHANGELOG

v2.1.0 (2024-05-11)

Chore

  • chore: remove use_deterministic_algorithms=True since it causes cuda errors (#137) (1a3bedb)

Feature

  • feat: Hooked toy model (#134)

  • adds initial re-implementation of toy models

  • removes instance dimension from toy models

  • fixing up minor nits and adding more tests


Co-authored-by: David Chanin <chanindav@gmail.com> (03aa25c)

v2.0.0 (2024-05-10)

Breaking

  • feat: rename batch sizes to give informative units (#133)

BREAKING CHANGE: renamed batch sizing config params

  • renaming batch sizes to give units

  • changes in notebooks

  • missed one!


Co-authored-by: David Chanin <chanindav@gmail.com> (cc78e27)

Chore

  • chore: tools to make tests more deterministic (#132) (2071d09)

  • chore: Make tutorial notebooks work in Google Colab (#120)

Co-authored-by: David Chanin <chanindav@gmail.com> (007141e)

v1.8.0 (2024-05-09)

Chore

  • chore: closing " in docs (#130) (5154d29)

Feature

  • feat: Add model_from_pretrained_kwargs as config parameter (#122)

  • add model_from_pretrained_kwargs config parameter to allow full control over the model used to extract activations. Update tests to cover new cases

  • tweaking test style


Co-authored-by: David Chanin <chanindav@gmail.com> (094b1e8)

v1.7.0 (2024-05-08)

Feature

  • feat: Add torch compile (#129)

  • Surface # of eval batches and # of eval sequences

  • fix formatting

  • config changes

  • add compilation to lm_runner.py

  • remove accidental print statement

  • formatting fix (5c41336)

  • feat: Change eval batch size (#128)

  • Surface # of eval batches and # of eval sequences

  • fix formatting

  • fix print statement accidentally left in (758a50b)

v1.6.1 (2024-05-07)

Fix

  • fix: Revert "feat: Add kl eval (#124)" (#127)

This reverts commit c1d9cbe8627f27f4d5384ed4c9438c3ad350d412. (1a0619c)

v1.6.0 (2024-05-07)

Feature

  • feat: Add bf16 autocast (#126)

  • add bf16 autocast and gradient scaling

  • simplify autocast setup

  • remove completed TODO

  • add autocast dtype selection (generally keep bf16)

  • formatting fix

  • remove autocast dtype (8e28bfb)

v1.5.0 (2024-05-07)

Feature

  • feat: Add kl eval (#124)

  • add kl divergence to evals.py

  • fix linter (c1d9cbe)

Unknown

  • major: How we train saes replication (#123)

  • l1 scheduler, clip grad norm

  • add provisional ability to normalize activations

  • notebook

  • change heuristic norm init to constant, report b_e and W_dec norms (fix tests later)

  • fix mse calculation

  • add benchmark test

  • update heuristic init to 0.1

  • make tests pass device issue

  • continue rebase

  • use better args in benchmark

  • remove stack in get activations

  • broken! improve CA runner

  • get cache activation runner working and add some tests

  • add training steps to path

  • avoid ghost grad tensor casting

  • enable download of full dataset if desired

  • add benchmark for cache activation runner

  • add updated tutorial

  • format


Co-authored-by: Johnny Lin <hijohnnylin@gmail.com> (5f46329)

v1.4.0 (2024-05-05)

Feature

  • feat: Store state to allow resuming a run (#106)

  • first pass of saving

  • added runner resume code

  • added auto detect most recent checkpoint code

  • make linter happy (and one small bug)

  • black code formatting

  • isort

  • help pyright

  • black reformatting:

  • activations store flake

  • pyright typing

  • black code formatting

  • added test for saving and loading

  • bigger training set

  • black code

  • move to pickle

  • use pickle because safetensors doesn't support everything needed for optimizer and scheduler state (see the sketch after this entry)

  • added resume test

  • added wandb_id for resuming

  • use wandb id for checkpoint

  • moved loading to device and minor fixes to resuming


Co-authored-by: David Chanin <chanindav@gmail.com> (4d12e7a)
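
The pickle-vs-safetensors note above reflects a concrete constraint: optimizer and scheduler state_dicts contain nested Python values, not just flat tensors, which safetensors cannot serialize. A minimal, hypothetical PyTorch sketch (file names and objects are illustrative, not the repo's actual checkpointing code):

```python
import torch

# Hypothetical sketch: optimizer/scheduler state_dicts hold nested Python
# values (ints, lists, dicts), which safetensors cannot serialize, so a
# pickle-based torch.save is used for resumable training state instead.
model = torch.nn.Linear(8, 8)
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda step: 1.0)

checkpoint = {
    "optimizer": optimizer.state_dict(),  # nested dict, not just tensors
    "scheduler": scheduler.state_dict(),  # plain Python values
}
torch.save(checkpoint, "training_state.pt")  # pickle under the hood
```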

Unknown

  • Fix: sparsity norm calculated at incorrect dimension. (#119)

  • Fix: sparsity norm calculated at incorrect dimension.

For L1 this does not affect anything, since it just takes the abs() and averages over everything. For L2 it is problematic, since L2 involves a sum and a sqrt, so unexpected behaviors occur when x has shape (batch, sen_length, hidden_dim); see the sketch after this entry.

  • Added tests.

  • Changed sparsity calculation to handle 3d inputs. (ce95fb2)
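
To make the dimension issue above concrete, here is a minimal, hypothetical PyTorch sketch (variable names are illustrative, not the repo's actual code): an L1 penalty gives the same mean whether reduced per token or over the whole tensor, but an L2 norm over the flattened tensor differs from the intended per-token norm over hidden_dim.

```python
import torch

x = torch.randn(4, 128, 512)  # (batch, sen_length, hidden_dim)

# L1: mean |x| is the same whether reduced per token or over everything,
# so the wrong reduction dimension goes unnoticed here.
l1_flat = x.abs().mean()
l1_per_token = (x.norm(p=1, dim=-1) / x.shape[-1]).mean()
assert torch.allclose(l1_flat, l1_per_token)

# L2: sum-then-sqrt is not invariant to which dims are reduced, so a norm
# over the flattened tensor (wrong) differs from a per-token norm over
# hidden_dim (intended).
l2_flat = x.norm(p=2)                 # sqrt of the sum over all elements
l2_per_token = x.norm(p=2, dim=-1)    # shape (batch, sen_length)
print(l2_flat, l2_per_token.mean())   # clearly different quantities
```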

v1.3.0 (2024-05-03)

Feature

  • feat: add activation bins for neuronpedia outputs, and allow customizing quantiles (#113) (05d650d)

  • feat: Update for Neuronpedia auto-interp (#112)

  • cleanup Neuronpedia autointerp code

  • Fix logic bug with OpenAI key


Co-authored-by: Joseph Bloom <69127271+jbloomAus@users.noreply.github.com> (033283d)

  • feat: SparseAutoencoder.from_pretrained() similar to transformer lens (#111)

  • add partial work so David can continue

  • feat: adding a SparseAutoencoder.from_pretrained() function


Co-authored-by: jbloomaus <jbloomaus@gmail.com> (617d416)

Fix

  • fix: replace list_files_info with list_repo_tree (#117) (676062c)

  • fix: Improved activation initialization, fix using argument to pass in API key (#116) (7047bcc)

v1.2.0 (2024-04-29)

Feature

  • feat: breaks up SAE.forward() into encode() and decode() (#107)

  • breaks up SAE.forward() into encode() and decode() (a generic sketch follows this entry)

  • cleans up return typing of encode by splitting into a hidden and public function (7b4311b)
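
As a rough illustration of that split, a generic sparse-autoencoder sketch under assumed parameter names (not the library's actual class): forward becomes the composition of the two public methods.

```python
import torch
import torch.nn as nn

class TinySAE(nn.Module):
    """Generic SAE sketch with the encode/decode split described above."""

    def __init__(self, d_in: int, d_sae: int):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_in, d_sae) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.W_dec = nn.Parameter(torch.randn(d_sae, d_in) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(d_in))

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # feature activations: ReLU of an affine map into the feature basis
        return torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)

    def decode(self, feats: torch.Tensor) -> torch.Tensor:
        # reconstruction: affine map back to the input space
        return feats @ self.W_dec + self.b_dec

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # forward is now just the composition of the two public methods
        return self.decode(self.encode(x))
```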

v1.1.0 (2024-04-29)

Feature

  • feat: API for generating autointerp + scoring for neuronpedia (#108)

  • API for generating autointerp for neuronpedia

  • Undo pytest vscode setting change

  • Fix autointerp import

  • Use pypi import for automated-interpretability (7c43c4c)

v1.0.0 (2024-04-27)

Breaking

  • chore: empty commit to bump release

BREAKING CHANGE: v1 release (2615a3e)

Chore

  • chore: fix outdated lr_scheduler_name in docs (#109)

  • chore: fix outdated lr_scheduler_name in docs

  • add tutorial hparams (7cba332)

Unknown

  • BREAKING CHANGE: 1.0.0 release

BREAKING CHANGE: 1.0.0 release (c23098f)

  • Neuronpedia: allow resuming upload (#102) (0184671)

v0.7.0 (2024-04-24)

Feature

  • feat: make a neuronpedia list with features via api call (#101) (23e680d)

Unknown

  • Merge pull request #100 from jbloomAus/np_improvements

Improvements to Neuronpedia Runner (5118f7f)

  • neuronpedia: save run settings to json file to avoid errors when resuming later. automatically skip batch files that already exist (4b5412b)

  • skip batch file if it already exists (7d0e396)

  • neuronpedia: include log sparsity threshold in skipped_indexes.json (5c967e7)

v0.6.0 (2024-04-21)

Chore

  • chore: enabling python 3.12 checks for CI (25526ea)

  • chore: setting up precommit to be consistent with CI (18e706d)

Feature

  • feat: Added tanh-relu activation fn and input noise options (#77)

  • Still need to pip-install from GitHub hufy implementation.

  • Added support for tanh_sae.

  • Added notebook for loading the tanh_sae

  • tweaking config options to be more declarative / composable

  • testing adding noise to SAE forward pass

  • updating notebook


Co-authored-by: David Chanin <chanindav@gmail.com> (551e94d)

Unknown

  • Update proposal.md (6d45b33)

  • Merge pull request #96 from jbloomAus/github-templates

add templates for PRs / issues (241a201)

  • add templates for PRs / issues (74ff597)

  • Merge pull request #95 from jbloomAus/load-state-dict-not-strict

Make load_state_dict use strict=False (4a9e274)

  • fix accidental bug (c22fbbd)

  • fix load pretrained legacy with state dict change (b5e97f8)

  • Make load_state_dict use strict=False (fdf7fe9)

  • Merge pull request #94 from jbloomAus/update-pre-commit

chore: setting up precommit to be consistent with CI (6a056b7)

  • Merge pull request #87 from evanhanders/old_to_new

Adds function that converts old .pt pretrained SAEs to new folder format (1cb1725)

  • Merge pull request #93 from jbloomAus/py-312-ci

chore: enabling python 3.12 checks for CI (87be422)

v0.5.1 (2024-04-19)

Chore

  • chore: re-enabling isort in CI (#86) (9c44731)

Fix

  • fix: pin pyzmq==26.0.1 temporarily (0094021)

  • fix: typing issue, temporary (25cebf1)

Unknown

  • v0.5.1 (0ac218b)

  • fixes string vs path typing errors (94f1fc1)

  • removes unused import (06406b0)

  • updates formatting for alignment with repo standards. (5e1f342)

  • consolidates with SAE class load_legacy function & adds test (0f85ded)

  • adds old->new file conversion function (fda2b57)

  • Merge pull request #91 from jbloomAus/decoder-fine-tuning

Decoder fine tuning (1fc652c)

  • par update (2bb5975)

  • Merge pull request #89 from jbloomAus/fix_np

Enhance + Fix Neuronpedia generation / upload (38d507c)

  • minor changes (bc766e4)

  • reformat run.ipynb (822882c)

  • get decoder fine tuning working (11a71e1)

  • format (040676d)

  • Merge pull request #88 from jbloomAus/get_feature_from_neuronpedia

FEAT: Add API for getting Neuronpedia feature (1666a68)

  • Fix resuming from batch (145a407)

  • Use original repo for sae_vis (1a7d636)

  • Use correct model name for np runner (138d5d4)

  • Merge main, remove eindex (6578436)

  • Add API for getting Neuronpedia feature (e78207d)

v0.5.0 (2024-04-17)

Feature

  • feat: Mamba support vs mamba-lens (#79)

  • mamba support

  • added init

  • added optional model kwargs

  • Support transformers and mamba

  • forgot one model kwargs

  • failed opts

  • tokens input

  • hack to fix tokens, will look into fixing mambalens

  • fixed checkpoint

  • added sae group

  • removed some comments and fixed merge error

  • removed unneeded params since that issue is fixed in mambalens now

  • Unneeded input param

  • removed debug checkpoint and eval

  • added refs to hookedrootmodule

  • feed linter

  • added example and fixed loading

  • made layer for eval change

  • fix linter issues

  • adding mamba-lens as optional dep, and fixing typing/linting

  • adding a test for loading mamba model

  • adding mamba-lens to dev for CI

  • updating min mamba-lens version

  • updating mamba-lens version


Co-authored-by: David Chanin <chanindav@gmail.com> (eea7db4)

Unknown

  • update readme (440df7b)

  • update readme (3694fd2)

  • Fix upload skipped/dead features (932f380)

  • Use python typer instead of shell script for neuronpedia jobs (b611e72)

  • Merge branch 'main' into fix_np (cc6cb6a)

  • convert sparsity to log sparsity if needed (8d7d404)

v0.4.0 (2024-04-16)

Feature

  • feat: support orthogonal decoder init and no pre-decoder bias (ac606a3)

Fix

  • fix: sae dict bug (484163e)

  • fix: session loader wasn't working (a928d7e)

Unknown

  • enable setting adam pars in config (1e53ede)

  • fix sae dict loader and format (c558849)

  • default orthogonal init false (a8b0113)

  • Formatting (1e3d53e)

  • Eindex required by sae_vis (f769e7a)

  • Upload dead feature stubs (9067380)

  • Make feature sparsity an argument (8230570)

  • Fix buffer (dde2481)

  • Merge branch 'main' into fix_np (6658392)

  • notebook update (feca408)

  • Merge branch 'main' into fix_np (f8fb3ef)

  • Final fixes (e87788d)

  • Don't use buffer, fix anomalies (2c9ca64)

v0.3.0 (2024-04-15)

Feature

  • feat: add basic tutorial for training saes (1847280)

v0.2.2 (2024-04-15)

Fix

  • fix: dense batch dim mse norm optional (8018bc9)

Unknown

  • format (c359c27)

  • make dense_batch_mse_normalization optional (c41774e)

  • Runner is fixed, faster, cleaned up, and now gives whole sequences instead of buffer. (3837884)

  • Merge branch 'main' into fix_np (3ed30cf)

  • add warning in run script (9a772ca)

  • update sae loading code (356a8ef)

  • add device override to session loader (96b1e12)

  • update readme (5cd5652)

v0.2.1 (2024-04-13)

Fix

  • fix: neuronpedia quicklist (6769466)

v0.2.0 (2024-04-13)

Chore

  • chore: improving CI speed (9e3863c)

  • chore: updating README.md with pip install instructions and PyPI badge (682db80)

Feature

  • feat: overhaul saving and loading (004e8f6)

Unknown

  • Use legacy loader, add back histograms, logits. Fix anomaly characters. (ebbb622)

  • Merge branch 'main' into fix_np (586e088)

  • Merge pull request #80 from wllgrnt/will-update-tutorial

bugfix - minimum viable updates to tutorial notebook (e51016b)

  • minimum viable fixes to evaluation notebook (b907567)

  • Merge pull request #76 from jbloomAus/faster-ci

perf: improving CI speed (8b00000)

  • try partial cache restore (392f982)

  • Merge branch 'main' into faster-ci (89e1568)

  • Merge pull request #78 from jbloomAus/fix-artifact-saving-loading

Fix artifact saving loading (8784c74)

  • remove duplicate code (6ed6af5)

  • set device in load from pretrained (b4e12cd)

  • fix typing issue which required ignore (a5df8b0)

  • remove print statement (295e0e4)

  • remove load with session option (74926e1)

  • fix broken test (16935ef)

  • avoid tqdm repeating during training (1d70af8)

  • avoid division by 0 (2c7c6d8)

  • remove old notebook (e1ad1aa)

  • use-sae-dict-not-group (27f8003)

  • formatting (827abd0)

  • improve artifact loading storage, tutorial forthcoming (604f102)

  • add safetensors to project (0da48b0)

  • Don't precompute background colors and tick values (271dbf0)

  • Merge pull request #71 from weissercn/main

Addressing notebook issues (8417505)

  • Merge pull request #70 from jbloomAus/update-readme-install

chore: updating README.md with pip install instructions and PyPI badge (4d7d1e7)

  • FIX: Add back correlated neurons, frac_nonzero (d532b82)

  • linting (1db0b5a)

  • fixed graph name (ace4813)

  • changed key for df_enrichment_scores, so it can be run (f0a9d0b)

  • fixed space in notebook 2 (2278419)

  • fixed space in notebook 2 (24a6696)

  • fixed space in notebook (d2f8c8e)

  • fixed pickle backwards compatibility in tutorial (3a97a04)

v0.1.0 (2024-04-06)

Feature

Fix

  • fix: removing paths-ignore from action to avoid blocking releases (28ff797)

  • fix: updating saevis version to use pypi (dbd96a2)

Unknown

  • Merge pull request #69 from chanind/remove-ci-ignore

fix: removing paths-ignore from action to avoid blocking releases (179cea1)

  • Update README.md (1720ce8)

  • Merge pull request #68 from chanind/updating-sae-vis

fix: hotfix updating saevis version to use pypi (a13cee3)

v0.0.0 (2024-04-06)

Chore

  • chore: adding more tests to ActivationsStore + light refactoring (cc9899c)

  • chore: running isort to fix imports (53853b9)

  • chore: setting up pyright type checking and fixing typing errors (351995c)

  • chore: enable full flake8 default rules list (19886e2)

  • chore: using poetry for dependency management (465e003)

  • chore: removing .DS_Store files (32f09b6)

Unknown

  • Merge pull request #66 from chanind/pypi

feat: setting up sae_lens package and auto-deploy with semantic-release (34633e8)

  • Merge branch 'main' into pypi (3ce7f99)

  • Merge pull request #60 from chanind/improve-config-typing

fixing config typing (b8fba4f)

  • setting up sae_lens package and auto-deploy with semantic-release (ba41f32)

  • fixing config typing

switch to using explicit params for ActivationsStore config instead of RunnerConfig base class (9be3445)

  • Merge pull request #65 from chanind/fix-forgotten-scheduler-opts

passing accidentally overlooked scheduler opts (773bc02)

  • passing accidentally overlooked scheduler opts (ad089b7)

  • Merge pull request #64 from chanind/lr-decay

adding lr_decay_steps and refactoring get_scheduler (c960d99)

  • adding lr_decay_steps and refactoring get_scheduler (fd5448c)

  • Merge pull request #53 from hijohnnylin/neuronpedia_runner

Generate and upload Neuronpedia artifacts (0b94f84)

  • format (792c7cb)

  • ignore type incorrectness in imported package (5fe83a9)

  • Merge pull request #63 from chanind/remove-eindex

removing unused eindex dependency (1ce44d7)

  • removing unused eindex dependency (7cf991b)

  • Safe to_str_tokens, fix memory issues (901b888)

  • Allow starting neuronpedia generation at a specific batch number (85d8f57)

  • FIX: Linting 'do not use except' (ce3d40c)

  • Fix vocab: Ċ should be line break. Also set left and right buffers (205b1c1)

  • Merge (b159010)

  • Update Neuronpedia Runner (885de27)

  • Merge pull request #58 from canrager/main

Make prepend BOS optional: Default True (48a07f9)

  • make tests pass with use_bos flag (618d4bb)

  • Merge pull request #59 from chanind/fix-docs-deploy

attempting to fix docs deploy (cfafbe7)

  • Adding tests to get_scheduler (13c8085)

  • Merge pull request #56 from chanind/sae-tests

minor refactoring to SAE and adding tests (2c425ca)

  • minor refactoring to SAE and adding tests (92a98dd)

  • adding tests to get_scheduler (3b7e173)

  • Generate and upload Neuronpedia artifacts (b52e0e2)

  • Merge pull request #54 from jbloomAus/hook_z_suppourt

notional support, needs more thorough testing (277f35b)

  • Merge pull request #55 from chanind/contributing-docs

adding a contribution guide to docs (8ac8f05)

  • adding a contribution guide to docs (693c5b3)

  • notional support, needs more thorough testing (9585022)

  • Generate and upload Neuronpedia artifacts (4540268)

  • Merge pull request #52 from hijohnnylin/fix_db_runner_assert

FIX: Don't check wandb assert if not using wandb (5c48811)

  • FIX: Don't check wandb assert if not using wandb (1adefda)

  • add docs badge (f623ed1)

  • try to get correct deployment (777dd6c)

  • Merge pull request #51 from jbloomAus/mkdocs

Add Docs to the project. (d2ebbd7)

  • mkdocs, test (9f14250)

  • code cov (2ae6224)

  • Merge pull request #48 from chanind/fix-sae-vis-version

Pin sae_vis to previous working version (3f8a30b)

  • fix suffix issue (209ba13)

  • pin sae_vis to previous working version (ae0002a)

  • don't ignore changes to .github (35fdeec)

  • add cov report (971d497)

  • Merge pull request #40 from chanind/refactor-train-sae

Refactor train SAE and adding unit tests (5aa0b11)

  • Merge branch 'main' into refactor-train-sae (0acdcb3)

  • Merge pull request #41 from jbloomAus/move_to_sae_vis

Move to sae vis (bcb9a52)

  • flake8 can ignore imports, we're using isort anyway (6b7ae72)

  • format (af680e2)

  • fix mps bug (e7b238f)

  • more tests (01978e6)

  • wip (4c03b3d)

  • more tests (7c1cb6b)

  • testing that sparsity counts get updated correctly (5b5d653)

  • adding some unit tests to _train_step() (dbf3f01)

  • Merge branch 'main' into refactor-train-sae (2d5ec98)

  • Update README.md (d148b6a)

  • Merge pull request #20 from chanind/activations_store_tests

chore: adding more tests to ActivationsStore + light refactoring (69dcf8e)

  • Merge branch 'main' into activations_store_tests (4896d0a)

  • refactoring train_sae_on_language_model.py into smaller functions (e75a15d)

  • support apollo pretokenized datasets (e814054)

  • handle saes saved before groups (5acd89b)

  • typechecker (fa6cc49)

  • fix geom median bug (8d4a080)

  • remove references to old code (861151f)

  • remove old geom median code (05e0aca)

  • Merge pull request #22 from themachinefan/faster_geometric_median

Faster geometric median. (341c49a)

  • makefile check type and types of geometric median (736bf83)

  • Merge pull request #21 from schmatz/fix-dashboard-image

Fix broken dashboard image on README (eb90cc9)

  • Merge pull request #24 from neelnanda-io/add-post-link

Added link to AF post (39f8d3d)

  • Added link to AF post (f0da9ea)

  • formatting (0168612)

  • use device, don't use cuda if not there (20334cb)

  • format (ce49658)

  • fix tsea typing (449d90f)

  • faster geometric median. Run geometric_median.py to test. (92cad26)

  • Fix dashboard image (6358862)

  • fix incorrect code used to avoid typing issue (ed0b0ea)

  • add nltk (bc7e276)

  • ignore various typing issues (6972c00)

  • add babe package (481069e)

  • make formatter happy (612c7c7)

  • share scatter so can link (9f88dc3)

  • add_analysis_files_for_post (e75323c)

  • don't block on isort linting (3949a46)

  • formatting (951a320)

  • Update README.md (b2478c1)

  • Merge pull request #18 from chanind/type-checking

chore: setting up pyright type checking and fixing typing errors (bd5fc43)

  • Merge branch 'main' into type-checking (57c4582)

  • Merge pull request #17 from Benw8888/sae_group_pr

SAE Group for sweeps PR (3e78bce)

  • Merge pull request #1 from chanind/sae_group_pr_isort_fix

chore: running isort to fix imports (dd24413)

  • black format (0ffcf21)

  • fixed expansion factor sweep (749b8cf)

  • remove tqdm from data loader, too noisy (de3b1a1)

  • fix tests (b3054b1)

  • don't calculate geom median unless you need to (d31bc31)

  • add to method (b3f6dc6)

  • flake8 and black (ed8345a)

  • flake8 linter changes (8e41e59)

  • Merge branch 'main' into sae_group_pr (082c813)

  • Delete evaluating.ipynb (d3cafa3)

  • Delete activation_storing.py (fa82992)

  • Delete lp_sae_training.py (0d1e1c9)

  • implemented SAE groups (66facfe)

  • Merge pull request #16 from chanind/flake-default-rules

chore: enable full flake8 default rules list (ad84706)

  • implemented sweeping via config list (80f61fa)

  • Merge pull request #13 from chanind/poetry

chore: using poetry for dependency management (496f7b4)

  • progress on implementing multi-sae support (2ba2131)

  • Merge pull request #11 from lucyfarnik/fix-caching-shuffle-edge-case

Fixed edge case in activation cache shuffling (3727b5d)

  • Merge pull request #12 from lucyfarnik/add-run-name-to-config

Added run name to config (c2e05c4)

  • Added run name to config (ab2aabd)

  • Fixed edge case in activation cache shuffling (18fd4a1)

  • Merge pull request #9 from chanind/rm-ds-store

chore: removing .DS_Store files (37771ce)

  • improve readme (f3fe937)

  • fix_evals_bad_rebase (22e415d)

  • evals changes, incomplete (736c40e)

  • make tutorial independent of artefact and delete old artefact (6754e65)

  • fix MSE in ghost grad (44f7988)

  • Merge pull request #5 from jbloomAus/clean_up_repo

Add CI/CD, black formatting, pre-commit with flake8 linting. Fix some bugs. (01ccb92)

  • clean up run examples (9d46bdd)

  • move where we save the final artifact (f445fac)

  • fix activations store inefficiency (07d38a0)

  • black format and linting (479765b)

  • dummy file change (912a748)

  • try adding this branch listed specifically (7fd0e0c)

  • yml not yaml (9f3f1c8)

  • add ci (91aca91)

  • get unit tests working (ade2976)

  • make unit tests pass, add make file (08b2c92)

  • add pytest-cov to requirements.txt (ce526df)

  • separate research from main repo (32b668c)

  • remove comma and set default store batch size lower (9761b9a)

  • notebook for Johny (39a18f2)

  • best practices ghost grads fix (f554b16)

  • Update README.md

improved the hyperpars (2d4caf6)

  • dashboard runner (a511223)

  • readme update (c303c55)

  • still hadn't fixed the issue, now fixed (a36ee21)

  • fix mean of loss which broke in last commit (b4546db)

  • generate dashboards (35fa631)

  • Merge pull request #3 from jbloomAus/ghost_grads_dev

Ghost grads dev (4d150c2)

  • save final log sparsity (98e4f1b)

  • start saving log sparsity (4d6df6f)

  • get ghost grads working (e863ed7)

  • add notebook/changes for ghost-grad (not working yet) (73053c1)

  • idk, probs good (0407ad9)

  • bunch of shit (1ec8f97)

  • Merge branch 'main' of github.com:jbloomAus/mats_sae_training (a22d856)

  • Reverse engineering the "not only... but" feature (74d4fb8)

  • Merge pull request #2 from slavachalnev/no_reinit

Allow sampling method to be None (4c5fed8)

  • Allow sampling method to be None (166799d)

  • research/week_15th_jan/gpt2_small_resid_pre_3.ipynb (52a1da7)

  • add arg for dead neuron calc (ffb75fb)

  • notebooks for lucy (0319d89)

  • add args for b_dec_init (82da877)

  • add geom median as submodule instead (4c0d001)

  • add geom median to req (4c8ac9d)

  • add-geometric-mean-b_dec-init (d5853f8)

  • reset feature sparsity calculation (4c7f6f2)

  • anthropic sampling (048d267)

  • get anthropic resampling working (ca74543)

  • add ability to finetune existing autoencoder (c1208eb)

  • run notebook (879ad27)

  • switch to batch size independent loss metrics (0623d39)

  • track mean sparsity (75f1547)

  • don't stop early (44078a6)

  • name runs better (5041748)

  • improve-eval-metrics-for-attn (00d9b65)

  • add hook q (b061ee3)

  • add copy suppression notebook (1dc893a)

  • fix check in neuron sampling (809becd)

  • Merge pull request #1 from jbloomAus/activations_on_disk

Activations on disk (e5f198e)

  • merge into main (94ed3e6)

  • notebook (b5344a3)

  • various research notebooks (be63fce)

  • Added activations caching to run.ipynb (054cf6d)

  • Added activations dir to gitignore (c4a31ae)

  • Saving and loading activations from disk (309e2de)

  • Fixed typo that threw out half of activations (5f73918)

  • minor speed improvement (f7ea316)

  • add notebook with different example runs (c0eac0a)

  • add ability to train on attn heads (18cfaad)

  • add gzip for pt artefacts (9614a23)

  • add_example_feature_dashboard (e90e54d)

  • get_shit_done (ce73042)

  • commit_various_things_in_progress (3843c39)

  • add sae visualizer and tutorial (6f4030c)

  • make it possible to load sae trained on cuda onto mps (3298b75)

  • reduce hist freq, don't cap re-init (debcf0f)

  • add loader import to readme (b63f14e)

  • Update README.md (88f086b)

  • improve-resampling (a3072c2)

  • add readme (e9b8e56)

  • fixl0_plus_other_stuff (2f162f0)

  • add checkpoints (4cacbfc)

  • improve_model_saving_loading (f6697c6)

  • stuff (19d278a)

  • Added support for non-tokenized datasets (afcc239)

  • notebook_for_keith (d06e09b)

  • fix resampling bug (2b43980)

  • test pars (f601362)

  • further-lm-improvments (63048eb)

  • get_lm_working_well (eba5f79)

  • basic-lm-training-currently-broken (7396b8b)

  • set_up_lm_runner (d1095af)

  • fix old test, may remove (b407aab)

  • happy with hyperpars on benchmark (836298a)

  • improve metrics (f52c7bb)

  • make toy model runner (4851dd1)

  • various-changes-toy-model-test (a61b75f)

  • Added activation store and activation gathering (a85f24d)

  • First successful run on toy models (4927145)

  • halfway-to-toy-models (feeb411)

  • Initial commit (7a94b0e)