The nnsight package enables interpreting and manipulating the internals of deep learned models.

Jupyter Notebook 462 41 Updated Jan 15, 2025

Inference algorithms for models based on Luce's choice axiom

Jupyter Notebook 164 28 Updated Dec 4, 2024

Steering vectors for transformer language models in PyTorch / Hugging Face

Python 81 7 Updated Nov 21, 2024

Code for the TinyStories experiments from "Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks".

Jupyter Notebook 5 1 Updated Dec 18, 2023

Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity".

Python 40 6 Updated Jan 19, 2024

A browser extension that deletes your news feed and replaces it with a nice quote

TypeScript 1,234 287 Updated Nov 28, 2024

ChatArena (or Chat Arena) provides multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.

Python 1,402 135 Updated May 27, 2024

lesspipe - display more with less

Perl 497 51 Updated Jan 4, 2025

Adds vim keybindings to all OS X inputs

Lua 712 33 Updated Apr 11, 2023

A library for mechanistic interpretability of GPT-style language models

Python 1,768 319 Updated Jan 22, 2025

🎢 Creating and sharing simulation environments for embodied and synthetic data research

Python 190 13 Updated Oct 19, 2023

A modular RL library to fine-tune language models to human preferences

Python 2,257 192 Updated Mar 1, 2024

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,566 473 Updated Jan 8, 2024
Python 86 6 Updated Jun 1, 2023

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Python 7,088 793 Updated Aug 24, 2023

Mechanistic Interpretability for Transformer Models

Python 49 6 Updated Jun 1, 2022

Code for the paper "Fine-Tuning Language Models from Human Preferences"

Python 1,269 163 Updated Jul 25, 2023

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

1,668 131 Updated Sep 19, 2023

Plug and play RAM percentage and icon indicator for Tmux

Shell 2 Updated Apr 15, 2022

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 5,040 158 Updated Jan 21, 2025

Use Google Docs like vim. Sorta.

JavaScript 113 14 Updated Sep 6, 2021

Train transformer language models with reinforcement learning.

Python 10,672 1,380 Updated Jan 22, 2025

A library for distributed ML training with PyTorch

C++ 366 22 Updated Dec 12, 2022

[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

Jupyter Notebook 797 48 Updated Aug 12, 2024

DMControl Generalization Benchmark

Python 168 43 Updated Jan 3, 2024

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Python 5,492 337 Updated Jul 21, 2024

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Python 487 60 Updated Aug 19, 2024

PAIRED in PyTorch 🔥

Python 57 20 Updated Mar 8, 2023

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Python 2,708 161 Updated Aug 18, 2024