Conversation

@mivanit (Contributor) commented Oct 29, 2024

Ready for review; requires HF_TOKEN to allow access to certain gated Mistral models.

Description

This PR ports features from my transformerlens-model-table repo to TransformerLens, implementing most of the features requested in #97. I still need feedback on a few points, and the docs build will presumably fail for one reason or another once the PR is opened.

Features:

The static table gains a few more fields, but the primary focus is the new interactive table, which provides:

  • information on parallel attn/mlps, positional embeddings, and other config elements
  • filtering and searching on any column (e.g. sort by parameter count and show only models with standard positional embeddings)
  • links back to the huggingface model page, where applicable (extracted from the "official model name")
  • tokenizer information, including a hash of the vocabulary (feedback welcome on whether there is a better way to do this; see the first sketch after this list)
  • full config in title text or new window
  • an organized view of the dimensions of all tensors in the state dict and activation cache (obtained by setting the device to meta, so models don't actually need to be loaded; see the second sketch after this list)
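
For context, here is a minimal sketch of the vocab-hash idea (hypothetical, not the code actually used in this PR): hash the tokenizer's vocabulary so tokenizers can be compared in the table without storing the full vocab. The sorting key, hash function, and truncation length are assumptions for illustration only.

```python
# Hypothetical sketch of the "vocab hash" idea, not this PR's actual implementation.
import hashlib
import json

from transformers import AutoTokenizer


def vocab_hash(tokenizer) -> str:
    # Sort by token id so the hash is independent of dict ordering.
    vocab = sorted(tokenizer.get_vocab().items(), key=lambda kv: kv[1])
    blob = json.dumps(vocab, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]


print(vocab_hash(AutoTokenizer.from_pretrained("gpt2")))
```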

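And a minimal sketch of the meta-device trick for collecting tensor shapes, shown here with a plain PyTorch module rather than a HookedTransformer: constructing a model under the meta device records parameter shapes and dtypes without allocating any real weights.

```python
# Minimal sketch of the meta-device idea; the module used here is illustrative only.
import torch
import torch.nn as nn

with torch.device("meta"):
    # Any nn.Module works; its parameters are created as shape-only "meta" tensors.
    model = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)

for name, param in model.state_dict().items():
    print(name, tuple(param.shape), param.dtype)
```
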
Adds dependencies

Under the docs dependency group:

  • tiktoken for dealing with certain tokenizers
  • muutils for pretty-printed data on tensor shapes

Type of change

  • This change requires a documentation update

Screenshots

Before:

[screenshot: original model properties table]

After (static):

You can see what the generated data looks like here

[screenshot: generated static table]

After (interactive):

See demo

[animated demo: tl-new]

Checklist:

(currently draft PR, testing incomplete)

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings (some warnings occur while generating the model table, but these can be ignored)
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes (NOTE: see the comment below for information about the failing docs build; it is due to missing access to the gated Mistral models, and a temporary line will need to be deleted once this is fixed)
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

Notes:

@mivanit mivanit marked this pull request as draft October 29, 2024 18:29
@mivanit mivanit force-pushed the improve-model-properties-table-docs branch from 61c245a to c086f8c Compare October 29, 2024 18:38
@mivanit (Contributor, Author) commented Oct 29, 2024

Note: when running the tests with no changes from dev, the docs build fails due to missing access to mistralai/Mistral-7B-v0.1 -- I assume whoever controls the HF_TOKEN in the repo secrets needs to request access.

OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-v0.1.
401 Client Error. (Request ID: Root=1-67212e07-3a24d32224b0b3925392aeab;e3e4f0e1-ceaa-442c-baa3-29300046cde8)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-v0.1/resolve/main/config.json.
Access to model mistralai/Mistral-7B-v0.1 is restricted. You must have access to it and be authenticated to access it. Please log in.

@mivanit (Contributor, Author) commented Oct 29, 2024

currently it seems like everything should work (still waiting on the action to finish), except:

  1. the HF_TOKEN provided in the CI environment does not allow access to some mistral models: mistral-7b, mistral-7b-instruct, mistral-nemo-base-2407, mixtral, mixtral-instruct. See below, or the failed action here
Error Details

ValueError: Failed to get model info for 5/190 models: {'mistral-7b': OSError(...), 'mistral-7b-instruct': OSError(...), 'mistral-nemo-base-2407': OSError(...), 'mixtral': OSError(...), 'mixtral-instruct': OSError(...)}

'mistral-7b': You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-v0.1. 401 Client Error. (Request ID: Root=1-6721425d-7e4a7c39481e17cd0c99bacc;3d2ced02-fc5a-49fa-9767-dbb319c3aa11)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-v0.1/resolve/main/config.json.
Access to model mistralai/Mistral-7B-v0.1 is restricted. You must have access to it and be authenticated to access it. Please log in.
'mistral-7b-instruct': You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1.
401 Client Error. (Request ID: Root=1-6721425d-10c0c4442b88a3090f42875c;b6fc9a41-a2f6-48ff-a04b-95e3121141da)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1/resolve/main/config.json.
Access to model mistralai/Mistral-7B-Instruct-v0.1 is restricted. You must have access to it and be authenticated to access it. Please log in.
'mistral-nemo-base-2407': You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-Nemo-Base-2407.
401 Client Error. (Request ID: Root=1-6721425d-130618ab049ef939252b6a77;4e85a3d4-923f-40d2-a827-32ae6c10d6cc)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-Nemo-Base-2407/resolve/main/config.json.
Access to model mistralai/Mistral-Nemo-Base-2407 is restricted. You must have access to it and be authenticated to access it. Please log in.
'mixtral': You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mixtral-8x7B-v0.1.
401 Client Error. (Request ID: Root=1-6721425d-2f78a8cf10acb3776a31fed7;3cb191e0-3498-42eb-8bfe-d04d3dbdfb68)

Cannot access gated repo for url https://huggingface.co/mistralai/Mixtral-8x7B-v0.1/resolve/main/config.json.
Access to model mistralai/Mixtral-8x7B-v0.1 is restricted. You must have access to it and be authenticated to access it. Please log in.
'mixtral-instruct': You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1.
401 Client Error. (Request ID: Root=1-6721425d-2a9f8aa73858bd7f6c64a923;4367ea82-8131-4b9d-b439-186d13b52bd5)

Cannot access gated repo for url https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/resolve/main/config.json.
Access to model mistralai/Mixtral-8x7B-Instruct-v0.1 is restricted. You must have access to it and be authenticated to access it. Please log in.

For the time being, I have set allow_except=True, which allows the model table to be generated even if fetching some configs fails:

get_model_table(
    model_table_path=GENERATED_DIR / "model_table.jsonl",
    force_reload=True,
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    allow_except=True,  # TEMPORARY: until HF_TOKEN in secrets allows access to models:
    # mistral-7b mistral-7b-instruct mistral-nemo-base-2407 mixtral mixtral-instruct
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)

Everything between the ~~~~~ markers should be deleted once the HF_TOKEN has access to the Mistral models.

  2. How the existing markdown-only table links to the interactive one:
# Model Properties Table
also see the [interactive model table](../_static/model_properties_table_interactive.html)

I am not sure what the best practice is here for where to put the new table and how to link to it. I do think it's worth keeping the static markdown table because it loads much faster and doesn't rely on JS to work.

@mivanit mivanit marked this pull request as ready for review October 29, 2024 20:33
@bryce13950 (Collaborator) commented:

That is very odd. The token in there should be mine, and I should have access to all of those models. Can you remove your change that allows it to pass those models? I will take a look and investigate further this coming week.

@mivanit (Contributor, Author) commented Nov 4, 2024

That is very odd. The token in there should be mine, and I should have access to all of those models. Can you remove your change that allows it to pass those models? I will take a look and investigate further this coming week.

dcfb6e8 passes all tests and the docs build (with the Mistral models being skipped), and
c710fc7 is the same but with allow_except=False. Thanks for taking a look!

…erLens into branch on mivanit fork

ran `poetry update` due to `poetry.lock` conflict
@mivanit (Contributor, Author) commented Dec 16, 2024

I'm getting a bunch of errors of the form

ERROR transformer_lens/some_file.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

I think I may have messed up the dependencies by running poetry update. I'm not quite sure what to do from here; I switched to uv a while ago partly because of problems like this.

Managed to fix this by resetting the lockfile to the version from main and running poetry lock --no-update.

@mivanit (Contributor, Author) commented Dec 16, 2024

currently it seems like everything should work (still waiting on the action to finish), except:

  1. the HF_TOKEN provided in the CI environment does not allow access to some mistral models: mistral-7b, mistral-7b-instruct, mistral-nemo-base-2407, mixtral, mixtral-instruct. See below, or the failed action here
    (...)

I'm still getting failures on these 5 models; that appears to be unchanged. I'm also getting "soft" failures on a variety of models, where the function is likewise unable to use the HF token.

Could it be that, since I'm not an authorized contributor to the repo, the repo secrets are not provided when I trigger a GitHub Action? That would make sense, since otherwise anyone could submit a PR that prints the secrets and steal them.

@bryce13950 bryce13950 merged commit 1ff356b into TransformerLensOrg:dev Jul 8, 2025
36 of 39 checks passed
bryce13950 added a commit that referenced this pull request Oct 16, 2025
* Update README.md (#957)

Update link to Streamlit tutorial and guide.

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* improve model properties table in docs (#769)

* add static to gitignore

* making a meaningless change to see if tests pass at all

* making a meaningless change to see if tests pass at all

* add interactive table static html only

adding things one at a time to see what causes things to break

* run poetry update with no changes to deps

* revert lockfile change

* add tiktoken >=0.7.0 to group docs

* add dep muutils >=0.6.15 to group docs

* add improved interactive table generation

we still generate a plain markdown table

code is from the old PR: https://github.com/mivanit/TransformerLens/blob/add-better-model-properties-table/docs/make_docs.py
which is in turn a modified version of https://github.com/mivanit/transformerlens-model-table

* fix format -- missing trailing newline

* fix type hints for compatibility

* fix torch device meta in make docs script, also improved hot reload

* TEMPORARY: allow_except when getting models to deal with mixtral HF_TOKEN issue

* added simple test for get_model_info

* context manager for controlling device, tests were breaking due to default device meta

* formatted with wrong version of black, oops

* fix path to generated model_properties_table

* fix md table header, add title in yaml frontmatter

* add line to frontmatter yaml, re-run tests bc huggingface down?

* do not allow exceptions when getting models

* re-run poetry lock

* attempt fix lockfile

* re-run poetry lock

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* switch pyproject from toml to uv, generate lockfile

also update tiktoken dep for 3.13 compatibility

* update makefile to use uv

* update actions

* hack to get version to work

* wip

* make dep

* update contributing.md to reflect switch from poetry to uv

* add type hints to supported_models

* fix paths in make_docs.py

* docs group not in default, update install instructions for docs

* POETRY_PYPI_TOKEN_PYPI -> PYPI_TOKEN_PYPI

* make format

* fix default groups, re-add docs

* add some deps needed in notebooks

* removed use of torchtyping in othello_GPT.ipynb and deps

- torchtyping causes various issues if it's imported
- presumably jaxtyping should be used instead??
- othello GPT notebook doesn't actually use the imported TT
  - shouldn't a linter/formatter catch this sort of unused import?

* fix: add pythonpath "." to pytest config for test imports

Configure pytest to include project root in Python path, enabling
`from tests.foo import bar`
style imports, which were broken by switching to uv

* attempt jupyter issue fix

* issue ref explaining ipython version restriction

* updated ci commands after recent work

* fixed more setup items

* added tabulate dependency

* updated make docs command

* updated dependencies

* fixed docs

---------

Co-authored-by: jmole <jmoeller@gmail.com>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>