- London
- https://robertkirk.github.io/
- @_robertkirk
Stars
The nnsight package enables interpreting and manipulating the internals of deep learned models.
Inference algorithms for models based on Luce's choice axiom
Steering vectors for transformer language models in Pytorch / Huggingface
Code for the TinyStories experiments from "Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks".
Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity".
A browser extension that deletes your news feed and replaces it with a nice quote
ChatArena (or Chat Arena) provides multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
A library for mechanistic interpretability of GPT-style language models
🎢 Creating and sharing simulation environments for embodied and synthetic data research
A modular RL library to fine-tune language models to human preferences
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Mechanistic Interpretability for Transformer Models
Code for the paper "Fine-Tuning Language Models from Human Preferences"
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
ddebode / tmux-ram
Forked from RobertKirk/tmux-ram
Plug and play RAM percentage and icon indicator for Tmux
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
Train transformer language models with reinforcement learning.
A library for distributed ML training with PyTorch
[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.
DMControl Generalization Benchmark
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).